Abstract In this study we consider limit theorems for microscopic stochastic models of neural fields.
We show that the Wilson-Cowan equation can be obtained as the limit in probability on compacts for
a sequence of microscopic models when the number of neuron populations distributed in space and
the number of neurons per population tend to infinity, though the latter divergence is not necessary.
This result also allows us to obtain limits for qualitatively different stochastic convergence concepts,
e.g., convergence in the mean. Further, we present a central limit theorem for the martingale part
of the microscopic models which, suitably rescaled, converges to a centered Gaussian process with
independent increments. These two results provide the basis for presenting the neural field Langevin
equation, a stochastic differential equation taking values in a Hilbert space, which is the
infinite-dimensional analogue of the Chemical Langevin Equation in the present setting. On a technical
level we apply recently developed laws of large numbers and central limit theorems for piecewise
deterministic processes taking values in Hilbert spaces to a master equation formulation of stochastic
neuronal network models. These theorems are valid for processes taking values in Hilbert spaces and
are thereby able to incorporate spatial structures of the underlying model.

1 Introduction

The present study is concerned with the derivation and justification of neural field equations from
finite size stochastic particle models, i.e., stochastic models for the behaviour of individual
neurons distributed in finitely many populations, in terms of mathematically precise probabilistic limit
theorems. We illustrate this approach with the example of the Wilson-Cowan equation

$$\tau\,\dot\nu(t,x) = -\nu(t,x) + f\Big(\int_D w(x,y)\,\nu(t,y)\,\mathrm{d}y + I(t,x)\Big). \tag{1.1}$$



We focus on the following two aspects:

(A) Often one wants to study deterministic equations such as equation (1.1) in order to obtain
results on the 'behaviour in the mean' of an intrinsically stochastic system. Thus we first discuss
limit theorems of the law of large numbers type for the limit of infinitely many particles. These
theorems connect the trajectories of the stochastic particle models to the deterministic solution
of mean field equations and hence provide a justification for studying equation (1.1) in order to
draw inferences about the behaviour of the stochastic system.

(B) Secondly, we aim to characterise the internal noise structure of the complex discrete stochastic
models, as in the limit of large numbers of neurons the noise is expected to be close to a simpler
stochastic process. Ultimately, this yields a stochastic neural field model in terms of a stochastic
evolution equation conceptually analogous to the Chemical Langevin Equation. The Chemical
Langevin Equation is widely used in the study of chemical reaction networks for which the
stochastic effects cannot be neglected but a numerical or analytical study of the exact discrete
model is not possible due to its inherent complexity.

In this study a microscopic model is understood to be a description as a stochastic process, usually
a Markov chain model, also called a master equation formulation (cf. [3, 5, 8, 9, 22], which contain various
master equation formulations of neural dynamics). In contrast, a macroscopic model is a deterministic
evolution equation such as (1.1). Deterministic mean field equations have been used widely and for
a long time to model and analyse large scale behaviour of the brain. In their original deterministic
form they are successfully used to model geometric visual hallucinations, orientation tuning in the
visual cortex and wave propagation in cortical slices, to mention only a few applications. We refer to
[7] for a recent review and an extensive list of references. The derivation of these equations is based
on a number of arguments from statistical physics and for a long time a justification from
microscopic models has not been available. The interest in deriving mean field equations from stochastic
microscopic models has been revived recently as it offers the possibility to derive deterministic
'corrections' to the mean field equations, also called second order approximations. These corrections
might account for the inherent stochasticity and thus incorporate so-called finite size effects. This
has been achieved either by applying a path-integral approach to the master equation [8, 9] or by
a van Kampen system-size expansion of the master equation [5]. In more detail, the author of the
latter reference proposes a particular master equation for a finite number of neuron populations and
derives the Wilson-Cowan equation as the first order approximation to the mean by employing the
van Kampen system size expansion and then taking the continuum limit for a continuum of
populations. By also keeping the second order terms, a 'stochastic' version of the mean field equation is
presented in the sense of coupling the first moment equation to an equation for the second moments.

However, the van Kampen system size expansion does not give a precise mathematical
connection, as it neither quantifies the type of convergence (quality of the limit) nor states conditions
under which the convergence is valid, nor does it allow one to characterise the speed of convergence.
Furthermore, particular care has to be taken in systems possessing multiple fixed points of the macroscopic
equation and we refer to [5] for a discussion of this aspect in the neural field setting. The limited applicability
of the van Kampen system size expansion was already well known to van Kampen [33, Sec. 10]. In
parallel to the work of van Kampen, T. Kurtz derived precise limit theorems connecting sequences
of continuous time Markov chains to solutions of systems of ordinary differential equations, see the
seminal studies [19, 20] or the monograph [15]. Limit theorems of that type are usually called the
fluid limit, thermodynamic limit or hydrodynamic limit; for a review, see, e.g., [13].

As is thoroughly discussed in [5], establishing the connection between master equation models
and mean field equations involves two limit procedures. First, a limit which takes the number of
particles, in this case neurons per considered population, to infinity (thermodynamic limit), and a
second which gives the mean field by taking the number of populations to infinity (continuum limit).
In this 'double limit' the theorems by Kurtz describe the connection of taking the number of neurons
per population to infinity, yielding a system of ordinary differential equations, one for each population.
Then the extension from finite to infinite dimensional state space is obtained by a continuum limit.
This procedure corresponds to the approach in [5]. Taking the double limit step by step thus raises
the question of what happens if we first take the spatial limit and then the fluid limit, thus reversing
the order of the limit procedures, or if we take the limits simultaneously. Recently, in
an extension to the work of Kurtz, one of the present authors and co-authors established limit
theorems that achieve this double limit [23], thus being able to connect finite population
master equation formulations directly to spatio-temporal limit systems, e.g., partial differential equations
or integro-differential equations such as the Wilson-Cowan equation (1.1). In a general framework
these limit theorems were derived for Piecewise Deterministic Markov Processes on Hilbert spaces
which in addition to the jump evolution also allow for a coupled deterministic continuous evolution.
This generality was motivated by applications to neuron membrane models consisting of microscopic
models of the ion channels coupled to a deterministic equation for the transmembrane potential. We
find that this generality is also advantageous for the present situation of a pure jump model as it
allows us to include time-dependent inputs. In this study we employ these theorems to achieve the aims
(A) and (B), focussing on the example of the deterministic limit given by the Wilson-Cowan equation.

Finally, we state what this study does not contain, which in particular distinguishes the present
study from [5, 8, 9] beyond mathematical technique. Presently, the aim is not to derive moment
equations, i.e., a deterministic set of equations that approximate the moments of the Markovian
particle model, but rather processes (deterministic or stochastic) to which a sequence of microscopic
models converges under suitable conditions in a probabilistic way. This means that a microscopic
model which is close to the limit, presently corresponding to a large number of neurons in a large
number of populations, can be assumed to be close to the limiting processes in structure and
pathwise dynamics as indicated by the quality of the stochastic limit. Hence, the present work is
conceptually, though neither in technique nor in results, close to [30], wherein, using a propagation of
chaos approach in the vicinity of neural field equations, the author also derives in a mathematically
precise way a limiting process to finite particle models. However, it is an obvious consequence that
the convergence of the models necessarily implies a close resemblance of their moment equations.
This provides the connection to [5, 8, 9], which we briefly comment on in Appendix B.

As a guide we close this introduction with an outline of the subsequent sections and some general
remarks on the notation employed in this study. In Sections 1.1 to 1.3 we first discuss the two types
of mean field models in more detail: on the one hand, the Wilson-Cowan equation as the macroscopic
limit and, on the other hand, a master equation formulation of a stochastic neural field. The main
results of the paper are found in Section 2. There we set up the sequence of microscopic models
and state conditions for convergence. Limit theorems of the law of large numbers type are presented
in Theorem 2.1 and Theorem 2.2 in Section 2.1. The first is a classical weak law of large numbers
providing uniform convergence on compacts in probability and the second provides convergence in the mean
uniformly over the whole positive time axis. Next, a central limit theorem for the martingale part
of the microscopic models is presented in Section 2.2, characterising the internal fluctuations of the
model to be of a diffusive nature in the limit. This part of the study is concluded in Section 2.3 by
presenting the Langevin approximations that arise as a result of the preceding limit theorems. The
proofs of the theorems in Section 2 are deferred to Section 4. The study is concluded in Section 3
with a discussion of the implications of the presented results and an extension of these limit theorems
to different master equation formulations or mean field equations.

Notations and conventions: Throughout the study we denote by $L^p(D)$, $1 \le p \le \infty$, the Lebesgue
spaces of real functions on a domain $D \subset \mathbb{R}^d$, $d \ge 1$. Physically reasonable choices are $d \in \{1,2,3\}$,
however for the mathematical theory presented the spatial dimension can be arbitrary. In the present
study spatial domains $D$ are always bounded with a sufficiently smooth boundary, where the minimal
assumption is a strong local Lipschitz condition, see [2]. For bounded domains $D$ this condition simply
means that for every point on the boundary its neighbourhood on the boundary is the graph of a
Lipschitz continuous function. Furthermore, for $\alpha \in \mathbb{N}$ we denote by $H^\alpha(D)$ the Sobolev spaces, i.e.,
subspaces of $L^2(D)$, with the corresponding Sobolev norm. For $\alpha \in \mathbb{R}_+ \setminus \mathbb{N}$ we denote by $H^\alpha(D)$ the
interpolating Besov spaces. In this study $H^{-\alpha}(D)$ is the dual space of $H^\alpha(D)$, which is in contrast to
the widespread notation to denote by $H^{-\alpha}(D)$, $\alpha \ge 0$, the dual space of $H^\alpha_0(D)$. As usual we have
$H^0(D) = L^2(D) = H^{-0}(D)$. We thus obtain a continuous scale of Hilbert spaces $H^\alpha(D)$, $\alpha \in \mathbb{R}$,
which satisfy that $H^{\alpha_2}(D)$ is continuously embedded¹ in $H^{\alpha_1}(D)$ for all $\alpha_1 < \alpha_2$. Next, a pairing
$(\cdot,\cdot)_{H^\alpha}$ denotes the inner product of the Hilbert space $H^\alpha(D)$ and pairings in angle brackets $\langle\cdot,\cdot\rangle_{H^\alpha}$
denote the duality pairing for the Hilbert space $H^\alpha(D)$. That is, for $\psi \in H^\alpha(D)$ and $\phi \in H^{-\alpha}(D)$
the expression $\langle\phi,\psi\rangle_{H^\alpha}$ denotes the application of the real, linear functional $\phi$ to $\psi$. Furthermore the
spaces $H^\alpha(D)$, $L^2(D)$ and $H^{-\alpha}(D)$ form an evolution triplet, i.e., the embeddings are dense and the
application of linear functionals and the inner product in $L^2(D)$ satisfy the relation

$$\langle\phi,\psi\rangle_{H^\alpha} = (\phi,\psi)_{L^2} \qquad \forall\, \phi \in L^2(D),\ \psi \in H^\alpha(D). \tag{1.2}$$

Norms in Hilbert spaces are denoted by $\|\cdot\|_{H^\alpha}$, and $\|\cdot\|_0$ is used to denote the supremum norm of real
functions, i.e., for $f : \mathbb{R} \to \mathbb{R}$ we have $\|f\|_0 = \sup_{z\in\mathbb{R}} |f(z)|$, and $|\cdot|$ denotes either the absolute value
for scalars or the Lebesgue measure for measurable subsets of Euclidean space. Finally, we use $\mathbb{N}_0$
to denote the set of non-negative integers, i.e., the natural numbers including zero.

1.1 The macroscopic limit 

Neural field equations are usually classified into two types, rate-based and activity-based models. The
prototype of the former is the Wilson-Cowan equation, see equation (1.1) which we also restate below,
and the Amari equation, see equation (3.7) in Section 3, is the prototype of the latter. Besides being
of a different structure, due to their derivation, the variables they describe have completely different
interpretations. In rate-based models the variable describes the average rate of activity at a certain
location and time, roughly corresponding to the fraction of active neurons in a certain infinitesimal
area. In activity-based models the macroscopic variable is an average electrical potential produced
by neurons at a certain location. For a concise physical derivation that leads to these models we refer
to [5]. In the following we consider rate-based equations, in particular the classical Wilson-Cowan
equation, to discuss the type of limit theorems we are able to obtain. We remark that the results
are essentially analogous for activity-based models.

Thus, the macroscopic model of interest is given by the equation

$$\tau\,\dot\nu(t,x) = -\nu(t,x) + f\Big(\int_D w(x,y)\,\nu(t,y)\,\mathrm{d}y + I(t,x)\Big), \tag{1.3}$$

where $\tau > 0$ is a decay time constant and $f : \mathbb{R} \to \mathbb{R}_+$ is a gain (or response) function that relates
the inputs that a neuron receives to its activity. In (1.3) the value $f(z)$ can be interpreted as the fraction of
neurons that receive at least threshold input. Furthermore, $w(x,y)$ is a weight function which states
the connectivity strength of a neuron located at $y$ to a neuron located at $x$ and, finally, $I(t,x)$ is an
external input which is received by a neuron at $x$ at time $t$. For the weight function $w : D \times D \to \mathbb{R}$
and the external input $I$ we assume that $w \in L^2(D \times D)$ and $I \in C(\mathbb{R}_+, L^2(D))$. As for the gain
function $f$, we assume in this study that $f$ is non-negative, satisfies a global Lipschitz condition with
constant $L > 0$, i.e.,

$$|f(a) - f(b)| \le L\,|a - b| \qquad \forall\, a, b \in \mathbb{R}, \tag{1.4}$$

and is bounded. From an interpretive point of view it is reasonable and consistent to stipulate that
$f$ is bounded by one, being a fraction, as well as being monotone. The latter property corresponds
to the fact that higher input results in higher activity. In specific models, $f$ is often chosen to be a
sigmoidal function, e.g., $f(z) = (1 + e^{-(\beta_1 z + \beta_2)})^{-1}$ or, as in [3], $f(z) = (\tanh(\beta_1 z + \beta_2) + 1)/2$, which
both satisfy $f \in [0, 1]$. Moreover, the most common choices of $f$ are even infinitely often differentiable
with bounded derivatives, which already implies the Lipschitz condition (1.4).
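To make the roles of the ingredients of the Wilson-Cowan equation concrete, the following is a minimal numerical sketch, not part of the original study. It assumes the one-dimensional domain D = [0, 1], a uniform grid, an explicit Euler scheme, and a logistic gain; the kernel, the parameter values, and the names `sigmoid` and `simulate_wilson_cowan` are illustrative assumptions.

```python
import numpy as np


def sigmoid(z, beta1=4.0, beta2=-2.0):
    """Logistic gain f(z) = 1 / (1 + exp(-(beta1*z + beta2))); beta values are illustrative."""
    return 1.0 / (1.0 + np.exp(-(beta1 * z + beta2)))


def simulate_wilson_cowan(nu0, w, inp, f=sigmoid, tau=1.0, T=5.0, dt=0.01):
    """Explicit Euler scheme for tau * dnu/dt = -nu + f(int_D w nu dy + I) on D = [0, 1].

    nu0 : initial values on the grid, shape (m,)
    w   : weight matrix w[i, j] ~ w(x_i, y_j), shape (m, m)
    inp : callable t -> input values I(t, x_i), shape (m,)
    """
    nu = np.asarray(nu0, dtype=float).copy()
    dx = 1.0 / nu.size                      # cell width: sum(...) * dx approximates the integral
    for step in range(int(round(T / dt))):
        t = step * dt
        drive = (w @ nu) * dx + inp(t)      # integral term plus external input
        nu += dt / tau * (-nu + f(drive))   # Euler step
    return nu


m = 100
x = (np.arange(m) + 0.5) / m                          # cell midpoints
w = np.exp(-np.abs(x[:, None] - x[None, :]) / 0.1)    # hypothetical exponential kernel
nu_T = simulate_wilson_cowan(np.full(m, 0.5), w, lambda t: np.zeros(m))
```

Since the gain takes values in [0, 1] and dt/tau is smaller than one, each Euler step is a convex combination of the current state and a value in [0, 1], so the discrete solution stays in [0, 1] whenever the initial condition does.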

The Wilson-Cowan equation (1.3) is well-posed in the strong sense as an integral equation in
$L^2(D)$ under the above conditions. That is, equation (1.3) possesses a unique, continuously
differentiable global solution $\nu$ to every initial condition $\nu(0) = \nu_0 \in L^2(D)$, i.e., $\nu \in C^1([0,T], L^2(D))$ for
all $T > 0$, which depends continuously on the initial condition. Furthermore, if the initial condition
satisfies $\nu_0(x) \in [0, \|f\|_0]$ almost everywhere in $D$, then it holds for all $t > 0$ that $\nu(t,x) \in (0, \|f\|_0)$
for almost all $x \in D$. For a brief derivation of these results we refer to Section [XI where we also state
a result about higher spatial regularity of the solution: Let $\alpha \in \mathbb{N}$ be such that $\alpha > d/2$. If now
$\nu_0 \in H^\alpha(D)$ and if $f$ is at least $\alpha$-times differentiable with bounded derivatives and the weights and
the input function satisfy $w \in H^\alpha(D \times D)$ and $I \in C(\mathbb{R}_+, H^\alpha(D))$, then the equation is well-posed
in $H^\alpha(D)$, i.e., for all $T > 0$, $\nu \in C^1([0,T], H^\alpha(D))$. In particular this implies that the solution $\nu$ is
jointly continuous on $\mathbb{R}_+ \times D$.

¹ A normed space $X$ is continuously embedded in another normed space $Y$, in symbols $X \hookrightarrow Y$, if $X \subset Y$ and
there exists a constant $K < \infty$ such that $\|u\|_Y \le K \|u\|_X$ for all $u \in X$.



1.2 Master equation formulations of neural network models 

For the microscopic model we concentrate on a variation of the model considered in [5, 6], which is
already an improvement on an earlier model. We extend the model to include variations
among neuron populations and, foremost, time-dependent inputs. We chose this model over the master
equation formulations in [8, 9] as it provides a more direct connection between the microscopic and
macroscopic models, see also the discussion in Section 3. We describe the main ingredients of the model
beginning with the simpler, time-independent model as prevalent in the literature. Subsequently, in
Section 1.3, the final, time-dependent model is defined.

We denote by $P$ the number of neuron populations in the model. Further, we assume that the
$k$-th neuron population consists of identical neurons which can be in one of two possible states:
active, i.e., emitting action potentials, and inactive, i.e., quiescent or not emitting action potentials.
Transitions between states occur instantaneously and at random times. For all $k = 1,\ldots,P$ the
random variables $\Theta^k_t$ denote the number of active neurons in population $k$ at time $t$. An integer $l(k)$ is used to
characterise the population size. This number $l(k)$ can be interpreted as the number of neurons
in the $k$-th population, at least for sufficiently large values. However, this is not accurate in the
literal sense as it is possible with positive probability for populations to contain more than $l(k)$
active neurons. Nevertheless, a posteriori the interpretation can be salvaged from the obtained limit
theorems.² It is a corollary of these that the probability of more than $l(k)$ neurons being active
for some time becomes arbitrarily small for large enough $l(k)$. Hence, for physiologically reasonable
neuron numbers the probability in these models of observing 'non-physiological' trajectories in the
interpretation becomes ever smaller.

Proceeding with notation, $\Theta_t = (\Theta^1_t, \ldots, \Theta^P_t)$ is an (unbounded) piecewise constant stochastic
process taking values in $\mathbb{N}_0^P$. The stochastic transitions from inactive to active states and vice versa
for a neuron in population $k$ are governed by a constant inactivation rate $\tau^{-1} > 0$, uniform across
all populations, and inputs from other neurons depending on the current network state. This
non-negative activation rate is given by $\tau^{-1} l(k) \bar f_k(\theta)$ for $\theta \in \mathbb{N}_0^P$. For the definition of $\bar f_k$ we consider
weights $W_{kj}$, $k,j = 1,\ldots,P$, which weight the input one neuron in population $k$ receives from a
neuron in population $j$. Then the activation rate of a neuron in population $k$ is proportional to

$$\bar f_k(\theta) := f\Big(\sum_{j=1}^{P} W_{kj}\,\theta^j\Big) \tag{1.5}$$

for a non-negative function $f : \mathbb{R} \to \mathbb{R}_+$, which obviously corresponds to the gain function $f$ in the
Wilson-Cowan equation (1.3). We remark that here $f$ is not the rate of activation of one neuron. In
this model the activation rate of a population is not proportional to the number of inactive neurons
but it is proportional to $l(k)$, which stands for the total number of neurons in the population. In [5]
this rate is thus interpreted as the rate with which a neuron becomes or remains active.

² The derivation of limit theorems for bounded population sizes, where $l(k)$ actually is the number of neurons
per population, is much more delicate than the subsequent presentation as the transition rate functions become
discontinuous. Although this would be a desirable result we have not yet been able to prove such a theorem, though
it is clear that the Wilson-Cowan equation would be the only possible limit. See also a discussion of this aspect in
Section 3.2.
It follows that the process $(\Theta_t)_{t\ge0}$ is a continuous-time Markov chain whose evolution is governed
by the following master equation, where $e_k$ denotes the $k$-th basis vector of $\mathbb{R}^P$,

$$\frac{\mathrm{d}\mathbb{P}[\theta,t]}{\mathrm{d}t} = \frac{1}{\tau}\sum_{k=1}^{P}\Big[\, l(k)\,\bar f_k(\theta - e_k)\,\mathbb{P}[\theta - e_k,t] - \big(\theta^k + l(k)\,\bar f_k(\theta)\big)\,\mathbb{P}[\theta,t] + (\theta^k + 1)\,\mathbb{P}[\theta + e_k,t] \Big], \tag{1.6}$$

which is endowed with the boundary conditions $\mathbb{P}[\theta,t] = 0$ if $\theta \notin \mathbb{N}_0^P$. In (1.6) the variable $\mathbb{P}[\theta,t]$
denotes the probability that the process $\Theta_t$ is in state $\theta$ at time $t$. Finally, the definition is completed
by stating an initial law $\mathcal{L}$, the distribution of $\Theta_0$, i.e., providing an initial value for the ODE
system (1.6).
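As a formal consistency check, here is a sketch of how the mean activity evolves under the rates described above, i.e., activation of population $k$ at rate $\tau^{-1} l(k) \bar f_k(\theta)$ and inactivation at rate $\tau^{-1}\theta^k$. It assumes that $\Theta_t$ has enough finite moments for the sums to be rearranged; this assumption is ours and is not verified here. Multiplying the master equation by $\theta^k$, summing over all states, and shifting the summation variable in the gain terms gives:

```latex
\begin{align*}
\frac{\mathrm{d}}{\mathrm{d}t}\,\mathbb{E}\big[\Theta^k_t\big]
  &= \sum_{\theta \in \mathbb{N}_0^P} \theta^k\,\frac{\mathrm{d}\mathbb{P}[\theta,t]}{\mathrm{d}t} \\
  &= \frac{1}{\tau}\sum_{\theta \in \mathbb{N}_0^P}
     \Big[ (\theta^k + 1)\,l(k)\,\bar f_k(\theta)
         - \theta^k\big(\theta^k + l(k)\,\bar f_k(\theta)\big)
         + (\theta^k - 1)\,\theta^k \Big]\,\mathbb{P}[\theta,t] \\
  &= \frac{1}{\tau}\Big( -\mathbb{E}\big[\Theta^k_t\big]
     + l(k)\,\mathbb{E}\big[\bar f_k(\Theta_t)\big] \Big).
\end{align*}
```

The terms with index $j \neq k$ cancel because shifting $\theta$ by $e_j$ leaves $\theta^k$ unchanged. Dividing by $l(k)$ yields $\tau\,\frac{\mathrm{d}}{\mathrm{d}t}\mathbb{E}[\Theta^k_t / l(k)] = -\mathbb{E}[\Theta^k_t / l(k)] + \mathbb{E}[\bar f_k(\Theta_t)]$, which has the same structure as the Wilson-Cowan equation. It is not a closed equation, however, since $\mathbb{E}[\bar f_k(\Theta_t)]$ is in general not a function of the means alone.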

Another definition of a continuous-time Markov chain is via its generator, see, e.g., [15], and it is
equivalent to the master equation (1.6). Although the master equation is widely used in the physics
and chemical reactions literature, the mathematically more appropriate object for the study of a
Markov process is its generator, and the master equation is an object derived from the generator, see
[33, Sec. V]. The generator of a Markov process is an operator defined on the space of real functions
over the state space of the process. For the above model defined by the master equation (1.6) the
generator is given by

$$\mathcal{A} g(\theta) = \lambda(\theta) \int_{\mathbb{N}_0^P} \big(g(\xi) - g(\theta)\big)\,\mu(\theta, \mathrm{d}\xi) \tag{1.7}$$

for all suitable $g : \mathbb{N}_0^P \to \mathbb{R}$. For details we refer to [15]. Here, $\lambda$ is the total instantaneous jump rate,
given by

$$\lambda(\theta) := \frac{1}{\tau}\sum_{k=1}^{P}\big(\theta^k + l(k)\,\bar f_k(\theta)\big), \tag{1.8}$$

and defines the distribution of the waiting time until the next jump, i.e.,

$$\mathbb{P}\big[\Theta_{t+s} = \Theta_t\ \forall\, s \in [0,\Delta t]\,\big|\,\Theta_t = \theta\big] = e^{-\lambda(\theta)\,\Delta t}.$$

Further, the measure $\mu$ in (1.7) is a Markov kernel on the state space of the process defining the
conditional distribution of the post-jump value, i.e.,

$$\mathbb{P}\big[\Theta_t \in A\,\big|\,\Theta_t \neq \Theta_{t-}\big] = \mu(\Theta_{t-}, A) \tag{1.9}$$

for all sets $A \subset \mathbb{N}_0^P$. In the present case for each $\theta$ the measure $\mu$ is given by the discrete distribution

$$\mu\big(\theta,\{\theta - e_k\}\big) = \frac{1}{\tau}\,\frac{\theta^k}{\lambda(\theta)}, \qquad \mu\big(\theta,\{\theta + e_k\}\big) = \frac{1}{\tau}\,\frac{l(k)\,\bar f_k(\theta)}{\lambda(\theta)} \qquad \forall\, k = 1,\ldots,P. \tag{1.10}$$

The importance of the generator lies in the fact that it fully characterises a Markov process and that
convergence of Markov processes is strongly connected to the convergence of their generators.
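The description of the chain through its total jump rate and post-jump kernel translates directly into a stochastic simulation (Gillespie-type) scheme: draw an exponential waiting time with the total rate, then pick the jump proportionally to the individual rates. The following is a minimal sketch under the assumption that a population activates at rate l(k) f(sum_j W_kj theta^j)/tau and loses an active neuron at rate theta^k/tau; the function name `gillespie`, the weights, and all parameter values are hypothetical choices for illustration.

```python
import numpy as np


def gillespie(theta0, W, l, f, tau=1.0, T=1.0, rng=None):
    """Simulate the time-homogeneous jump chain up to time T; returns (jump times, states)."""
    rng = np.random.default_rng() if rng is None else rng
    theta = np.array(theta0, dtype=np.int64)
    P = theta.size
    t = 0.0
    times, states = [t], [theta.copy()]
    while True:
        inact = theta / tau                      # rates of theta -> theta - e_k
        act = l * f(W @ theta) / tau             # rates of theta -> theta + e_k
        rates = np.concatenate([inact, act])
        lam = rates.sum()                        # total instantaneous jump rate
        if lam <= 0.0:
            break                                # absorbing state: no further jumps
        t += rng.exponential(1.0 / lam)          # exponential waiting time with rate lam
        if t > T:
            break
        idx = rng.choice(2 * P, p=rates / lam)   # post-jump value chosen proportionally to the rates
        if idx < P:
            theta[idx] -= 1
        else:
            theta[idx - P] += 1
        times.append(t)
        states.append(theta.copy())
    return np.array(times), np.array(states)


P = 3
l = np.full(P, 50)
W = np.full((P, P), 1.0 / (P * 50))              # hypothetical weights keeping W @ theta of order one
gain = lambda z: 1.0 / (1.0 + np.exp(-(4.0 * z - 2.0)))
times, states = gillespie(np.zeros(P, dtype=int), W, l, gain, T=2.0,
                          rng=np.random.default_rng(0))
```

Because the inactivation rate of an empty population is zero, the state never leaves the non-negative orthant, in line with the boundary condition of the master equation.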



1.3 Including external time-dependent input 

Until now the microscopic model does not incorporate any time-dependent input into the system.
In analogy to the macroscopic equation (1.3) this input enters into the model inside the activation rate
function $\bar f_k$. Thus let $I_k(t)$ denote the external input into a neuron in population $k$ at time $t$; then
the time-dependent activation rate is given by

$$\bar f_k(\theta,t) := f\Big(\sum_{j=1}^{P} W_{kj}\,\theta^j + I_k(t)\Big). \tag{1.11}$$

The most important qualitative difference when substituting (1.5) by (1.11) is that the corresponding
Markov process is no longer homogeneous. In particular the waiting time distributions in between
jumps are no longer exponential but satisfy

$$\mathbb{P}\big[\Theta_{t+s} = \Theta_t\ \forall\, s \in [0,\Delta t]\,\big|\,\Theta_t = \theta\big] = \exp\Big(-\int_0^{\Delta t} \lambda(\theta, t+s)\,\mathrm{d}s\Big).$$

Hence, the resulting process is an inhomogeneous continuous-time Markov chain, see, e.g., [36, Sec. 2].
It is straightforward to write down the corresponding master equation analogously to (1.6), yielding
a system of non-autonomous ordinary differential equations, cf. the master equation formulation
in [H]. Similarly there exists the notion of a time-dependent generator for inhomogeneous Markov
processes, cf. [15, Sec. 4.7]. Employing a standard trick, that is, suitably extending the state space of
the process, we can transform an inhomogeneous into a homogeneous Markov process [15, 28]. That is,
the space-time process $Y_t := (\Theta_t, t)$ is again a homogeneous Markov process. The initial law of the
associated space-time process is $\mathcal{L} \times \delta_0$ on $\mathbb{N}_0^P \times \mathbb{R}_+$. We emphasise that the definitions of the space-time
process and its initial law imply that the time component starts at 0 a.s. and, moreover, moves
continuously and deterministically. That is, the trajectories satisfy in between jumps the differential
equation

$$\begin{pmatrix} \dot\theta \\ \dot t \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix}.$$

The jump intensity $\lambda(\theta,t)$ is given by the sum of all individual time-dependent rates analogously to
(1.8). Finally, the post-jump value is given by a Markov kernel $\mu((\theta,t), \cdot) \times \delta_t$, as there clearly do not
occur jumps in the progression of time, and $\mu$ is the obvious time-dependent modification of (1.10).

It thus follows that the space-time process $(\Theta_t,t)_{t\ge0}$ is a homogeneous Piecewise Deterministic
Markov Process (PDMP), see, e.g., [14, 16, 26]. This connection is particularly important as we apply
in the course of the present study limit theorems developed for this type of process, see [27]. Finally,
for the space-time process $(\Theta_t,t)_{t\ge0}$ we obtain for suitable functions $g : \mathbb{N}_0^P \times \mathbb{R}_+ \to \mathbb{R}$ the generator

$$\mathcal{A} g(\theta,t) = \partial_t g(\theta,t) + \lambda(\theta,t) \int_{\mathbb{N}_0^P} \big(g(\xi,t) - g(\theta,t)\big)\,\mu\big((\theta,t),\mathrm{d}\xi\big). \tag{1.12}$$
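With time-dependent input the waiting times are no longer exponential, so a plain Gillespie step does not apply. One common workaround, which is not taken from the original study, is thinning: since the gain function is bounded, the jump rate in a fixed state is dominated by a constant, candidate times are drawn with that constant rate, and each candidate is accepted with probability equal to the ratio of the true rate to the bound. The sketch below assumes activation rates l(k) f(sum_j W_kj theta^j + I_k(t))/tau and inactivation rates theta^k/tau; the function name, the input signal, and all parameter values are hypothetical.

```python
import numpy as np


def simulate_with_input(theta0, W, l, f, inp, f_max=1.0, tau=1.0, T=1.0, rng=None):
    """Thinning scheme for the chain with time-dependent activation rates.

    inp   : callable t -> array of inputs I_k(t), shape (P,)
    f_max : upper bound on the gain f, used to dominate the jump rate
    """
    rng = np.random.default_rng() if rng is None else rng
    theta = np.array(theta0, dtype=np.int64)
    l = np.asarray(l, dtype=float)
    P = theta.size
    t = 0.0
    times, states = [t], [theta.copy()]
    while True:
        lam_bar = (theta.sum() + f_max * l.sum()) / tau   # dominates the jump rate for every time
        t += rng.exponential(1.0 / lam_bar)               # candidate jump time
        if t > T:
            break
        rates = np.concatenate([theta, l * f(W @ theta + inp(t))]) / tau
        lam = rates.sum()                                 # true jump rate at the candidate time
        if rng.uniform() * lam_bar >= lam:
            continue                                      # candidate rejected, state unchanged
        idx = rng.choice(2 * P, p=rates / lam)
        if idx < P:
            theta[idx] -= 1                               # inactivation in population idx
        else:
            theta[idx - P] += 1                           # activation in population idx - P
        times.append(t)
        states.append(theta.copy())
    return np.array(times), np.array(states)


P = 2
l = np.full(P, 40)
W = np.full((P, P), 1.0 / (P * 40))                       # hypothetical weights
gain = lambda z: 1.0 / (1.0 + np.exp(-(4.0 * z - 2.0)))
signal = lambda t: np.full(P, 0.5 * np.sin(2.0 * np.pi * t))
times, states = simulate_with_input(np.zeros(P, dtype=int), W, l, gain, signal,
                                    T=2.0, rng=np.random.default_rng(1))
```

The bound is recomputed after every accepted jump because the inactivation part of the rate depends on the current state.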



2 A precise formulation of the limit theorems 

In this section we present the precise formulations of the limit theorems. To this end we first define
a suitable sequence of microscopic models which gives the connection between the defining objects
of the Wilson-Cowan equation (1.3) and the microscopic models discussed in Section 1.2. Thus,
$(Y^n_t)_{t\ge0} = (\Theta^n_t, t)_{t\ge0}$, $n \in \mathbb{N}$, denotes a sequence of microscopic PDMP neural field models of
the type defined in Section 1.3. Each process $(Y^n_t)_{t\ge0}$ is defined on a filtered probability space
$(\Omega^n, \mathcal{F}^n, (\mathcal{F}^n_t)_{t\ge0}, \mathbb{P}^n)$ which satisfies the usual conditions. Hence, the defining objects for the jump
models now depend on an additional index $n$. That is, $P(n)$ denotes the number of neuron
populations in the $n$-th model, $l(k,n)$ is the number of neurons in the $k$-th population of the $n$-th
model, and analogously we use the notations $W^n_{kj}$, $I_{k,n}$ and $\bar f_{k,n}$. However, we note from the
beginning that the decay rate $\tau^{-1}$ is independent of $n$ and $\tau$ is the time constant in the Wilson-Cowan
equation (1.3). In the following paragraphs we discuss the connection of the defining components of
this sequence of microscopic models to the components of the macroscopic limit.

Connection to the spatial domain D. A key step in connecting the microscopic models to the
solution of equation (1.3) is that we need to put the individual neuron populations into relation to the
spatial domain $D$ the solution of (1.3) lives on. To this end we assume that each population is located
within a subdomain of $D$ and that the subdomains of the individual populations are non-overlapping.
Hence, for each $n \in \mathbb{N}$ we obtain a collection $\mathcal{D}_n$ of $P(n)$ non-overlapping subsets of $D$ denoted by
$D_{1,n}, \ldots, D_{P(n),n}$. We assume that each subdomain is measurable and convex. The convexity of
the subdomains is a technical condition that allows us to apply Poincaré's inequality, cf. (4.1). We
do not think that this condition is too restrictive as most reasonable partition domains, e.g., cubes and
triangles, are convex. Furthermore, for all reasonable domains $D$, e.g., all Jordan measurable domains,
a sequence of convex partitions can be found such that additionally the conditions imposed in the
limit theorems below are also satisfied. Conversely, one may think of obtaining the collection $\mathcal{D}_n$ by
partitioning the domain into $P(n)$ convex subdomains $D_{1,n}, \ldots, D_{P(n),n}$ and confining each neuron
population to one subdomain. However, it is not required that the union of the sets in $\mathcal{D}_n$ amounts to
the full domain $D$ nor that the partitions consist of refinements. Necessary conditions on the limiting
behaviour of the subdomains are very strongly connected to the convergence of initial conditions of
the models, which is a condition in the limit theorems, see below. For the sake of terminological
simplicity we refer to $\mathcal{D}_n$ simply as the partitions.

We now define some notation for parameters characterising the partitions $\mathcal{D}_n$: the minimum and
maximum Lebesgue measure, i.e., length, area or volume depending on the spatial dimension, are
denoted by

$$v_-(n) := \min_{k=1,\ldots,P(n)} |D_{k,n}|, \qquad v_+(n) := \max_{k=1,\ldots,P(n)} |D_{k,n}|, \tag{2.1}$$

and the maximum diameter of the partition is denoted by

$$\delta_+(n) := \max_{k=1,\ldots,P(n)} \operatorname{diam}(D_{k,n}), \tag{2.2}$$

where the diameter of a set $D_{k,n}$ is defined as $\operatorname{diam}(D_{k,n}) := \sup_{x,y \in D_{k,n}} |x - y|$. In the special case
of domains obtained by unions of cubes with edge length $n^{-1}$ it obviously holds that $v_\pm(n) = n^{-d}$
and $\delta_+(n) = \sqrt{d}\,n^{-1}$. It is a necessary condition in all the limit theorems that $\lim_{n\to\infty} \delta_+(n) = 0$,
which implies that $\lim_{n\to\infty} v_+(n) = 0$ as well as $\lim_{n\to\infty} P(n) = \infty$, as the Lebesgue measure of a
set is bounded in terms of the diameter of the set. That is, in order to obtain a limit the sequence
of partitions necessarily consists of ever finer sets and the number of neuron populations has to
diverge. Finally, each domain $D_{k,n}$ of the partition $\mathcal{D}_n$ contains one neuron population 'consisting'
of $l(k,n) \in \mathbb{N}$ neurons. Then we denote by $l_\pm(n)$ the maximum and minimum number of neurons in
populations corresponding to the $n$-th model, i.e.,

$$l_-(n) := \min_{k=1,\ldots,P(n)} l(k,n), \qquad l_+(n) := \max_{k=1,\ldots,P(n)} l(k,n). \tag{2.3}$$
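The partition parameters can be checked numerically in the cube case. The sketch below assumes the domain is the unit cube $[0,1]^d$ split into $n^d$ cubes of edge length $1/n$; the function name `cube_partition_parameters` is a hypothetical helper, not notation from the study.

```python
import itertools

import numpy as np


def cube_partition_parameters(n, d):
    """Return (v_minus, v_plus, delta_plus, P) for [0, 1]^d split into cubes of edge length 1/n."""
    edge = 1.0 / n
    volumes, diameters = [], []
    for corner in itertools.product(range(n), repeat=d):
        lower = np.array(corner) * edge
        upper = lower + edge
        volumes.append(np.prod(upper - lower))            # Lebesgue measure of the cube
        diameters.append(np.linalg.norm(upper - lower))   # length of the main diagonal
    return min(volumes), max(volumes), max(diameters), len(volumes)


v_minus, v_plus, delta_plus, P = cube_partition_parameters(n=4, d=2)
```

For n = 4 and d = 2 this gives 16 subdomains of measure 1/16 and diameter sqrt(2)/4, so the diameter shrinks like 1/n while the number of populations grows like n^d.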



Connection to the weight function w. We assume that there exists a function $w : D \times D \to \mathbb{R}$
such that the connection to the discrete weights is given by

where $w$ is the same function as in the Wilson-Cowan equation (1.3). For the definition of the activation
rate at time $t$ we thus obtain

$$\bar f_{k,n}(\theta^n,t) := f\Big(\sum_{j=1}^{P(n)} W^n_{kj}\,\frac{\theta^{j,n}}{l(j,n)} + I_{k,n}(t)\Big). \tag{2.5}$$



Connection to the input current I. The external input which is applied to neurons in a certain
population is obtained by spatially averaging a space-time input over the subdomain that population
is located in, i.e.,

$$I_{k,n}(t) := \frac{1}{|D_{k,n}|}\int_{D_{k,n}} I(t,x)\,\mathrm{d}x. \tag{2.6}$$

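This completes the definition of the Markov jump processes $(\Theta^n_t, t)_{t\ge0}$. For the sake of
completeness we repeat the definition of the total jump rate

$$\lambda^n(\theta^n,t) := \frac{1}{\tau}\sum_{k=1}^{P(n)}\big(\theta^{k,n} + l(k,n)\,\bar f_{k,n}(\theta^n,t)\big)$$

and the transition measure $\mu^n$ is defined, analogously to (1.10), by

$$\mu^n\big((\theta^n,t),\{\theta^n - e_k\}\big) = \frac{1}{\tau}\,\frac{\theta^{k,n}}{\lambda^n(\theta^n,t)}, \qquad \mu^n\big((\theta^n,t),\{\theta^n + e_k\}\big) = \frac{1}{\tau}\,\frac{l(k,n)\,\bar f_{k,n}(\theta^n,t)}{\lambda^n(\theta^n,t)}$$

for all $k = 1,\ldots,P(n)$.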

Connection to the solution $\nu$. As functions of time, the paths of the PDMP $(\Theta^n_t, t)_{t\ge 0}$ and
the solution $\nu$ live on different state spaces. The former takes values in $\mathbb{N}^P \times \mathbb{R}_+$ and the latter in
$L^2(D)$. Thus, in order to compare the two, we have to introduce a mapping that maps the stochastic
process into $L^2(D)$. In [27] the authors called such a mapping a coordinate function, which is also
the terminology used in [TH]. In fact, the limit theorems we subsequently present are for
the processes we obtain from the composition of the coordinate functions with the PDMPs. Here
it is important to note that for each $n \in \mathbb{N}$ the coordinate functions may, and usually do, differ;
however, they project the process into the common space $L^2(D)$. For the mean field models we define
the coordinate functions for all $n \in \mathbb{N}$ by

\[ \nu^n : \mathbb{N}^P \to L^2(D) : \theta \mapsto \sum_{k=1}^{P} \frac{\theta^k}{l(k,n)}\, \mathbb{1}_{D_{k,n}}. \tag{2.7} \]

Clearly each $\nu^n$ is a measurable map into $L^2(D)$. For the composition of $\nu^n$ with the stochastic
process $(\Theta^n_t)_{t\ge 0}$ we also use the abbreviation $\nu^n_t := \nu^n(\Theta^n_t)$, and hence the resulting stochastic
process $(\nu^n_t)_{t\ge 0}$ is an adapted càdlàg process taking values in $L^2(D)$. This process thus states the
activity at a location $x \in D$ as the fraction of active neurons in the population which is located
around this location.
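
As a small illustration of the coordinate function, the following Python sketch evaluates the piecewise-constant activity profile (fraction of active neurons per population) for a partition of a one-dimensional interval. The one-dimensional setting, the list `edges` of cell boundaries and the function names are assumptions made only for this example.

```python
def coordinate_function(theta, l, edges):
    """Return x -> fraction of active neurons in the population located around x.

    theta : number of active neurons per population
    l     : population sizes
    edges : cell boundaries, cell k is [edges[k], edges[k + 1])
    """
    def nu(x):
        for k in range(len(theta)):
            if edges[k] <= x < edges[k + 1]:
                return theta[k] / l[k]
        return 0.0  # x lies outside the partition
    return nu


def l2_norm_squared(theta, l, edges):
    """Squared L^2 norm of the piecewise-constant activity profile."""
    return sum((theta[k] / l[k]) ** 2 * (edges[k + 1] - edges[k])
               for k in range(len(theta)))
```

For instance, with two populations of sizes 20 and 40 on the cells $[0, 0.5)$ and $[0.5, 1)$ and 10 and 30 active neurons, the profile takes the values 0.5 and 0.75 on the two cells.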

Connection of the initial conditions. One condition in the subsequent limit theorems is the
convergence of initial conditions in probability, i.e., the assumption that

\[ \lim_{n\to\infty} \mathbb{P}^n\big[ \|\nu^n(\Theta^n_0) - \nu_0\|_{L^2} > \epsilon \big] = 0 \quad \forall\, \epsilon > 0. \tag{2.8} \]

It is easy to see that such a sequence of initial conditions $\Theta^n_0$, $n \in \mathbb{N}$, can be found for any deterministic
initial condition $\nu_0$ under some reasonable conditions on the domain $D$ and the sequence of partitions
$\mathcal{D}_n$. Hence the assumption (2.8) can always be satisfied. For example, we may define such a sequence
of initial conditions by

\[ \Theta^{k,n}_0 := \operatorname{argmin}_{i=0,\ldots,l(k,n)} \Big| \frac{i}{l(k,n)} - \frac{1}{|D_{k,n}|} \int_{D_{k,n}} \nu(0,x)\,dx \Big|. \]

Next, assuming that the partitions fill the whole domain $D$ for $n \to \infty$, i.e., $\lim_{n\to\infty} |D \setminus \bigcup_{k=1}^{P} D_{k,n}| = 0$,
and that the maximal diameter of the sets decreases to zero, i.e., $\lim_{n\to\infty} \delta_+(n) = 0$, it is easy to
see using the Poincaré inequality (4.1) that the above definition of the initial condition implies that
$\|\nu^n_0 - \nu(0)\|_{L^2} \to 0$ and $\sup_{n\in\mathbb{N}} \|\nu^n_0\|^{2r}_{L^2} < \infty$ for all $r \ge 1$. Then (2.8) holds trivially as the initial
condition is deterministic and converges. A simple non-degenerate sequence of initial conditions is
obtained by choosing random initial conditions with the above value as their mean and sufficiently
fast decreasing fluctuations. Furthermore, a sequence of partitions which satisfies the above conditions
also exists for a large class of reasonable domains $D$. Assume that $D$ is Jordan measurable, i.e., a
bounded domain such that the boundary is a Lebesgue null set, and let $\mathcal{C}_n$ be the smallest grid of
cubes with edge length $1/n$ covering $D$. We define $\mathcal{D}_n$ to be the set of all cubes which are fully in $D$.
As $D$ is Jordan measurable these partitions fill up $D$ from inside and $\delta_+(n) \to 0$. For a more detailed
discussion of these aspects we refer to [26].
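
The deterministic choice of initial conditions described above (for each population, take the admissible fraction closest to the cell average of the initial profile) can be sketched in Python as follows. The example is restricted to the unit interval split into equal cells, and the cell averages are approximated by a midpoint rule; both are assumptions for illustration, and the function names are hypothetical.

```python
def cell_means(nu0, n_cells, samples_per_cell=100):
    """Midpoint-rule averages of nu0 over n_cells equal cells of [0, 1]."""
    means = []
    h = 1.0 / (n_cells * samples_per_cell)
    for k in range(n_cells):
        a = k / n_cells  # left end point of cell k
        total = sum(nu0(a + (m + 0.5) * h) for m in range(samples_per_cell))
        means.append(total / samples_per_cell)
    return means


def initial_counts(means, l):
    """For each population pick i in {0, ..., l(k)} minimising |i / l(k) - mean_k|."""
    return [min(range(size + 1), key=lambda i: abs(i / size - mean))
            for mean, size in zip(means, l)]
```

For cell averages in $[0,1]$ the resulting fraction differs from the cell average by at most $1/(2\,l(k,n))$, which is the quantisation error that vanishes as the population sizes grow.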

In the remainder of this section we now collect the main results of this article. We start with the 
law of large numbers, which establishes the connection to the deterministic mean field equation, and 
then proceed to central limit theorems which provide the basis for a Langevin approximation. The 
proofs of the results are deferred to Section 4.

2.1 A law of large numbers

The first law of large numbers takes the following form. Note that the assumptions imply that the 
number of neuron populations diverges. 

Theorem 2.1 (Law of large numbers) Let $w \in L^2(D) \times L^2(D)$ and $I \in L^2_{loc}(\mathbb{R}_+, H^1(D))$. Assume
that the sequence of initial conditions converges to $\nu(0)$ in probability in the space $L^2(D)$, i.e., (2.8) holds,
that $\mathbb{E}^n \Theta^{k,n}_0 \le l(k,n)$ and that

\[ \lim_{n\to\infty} \delta_+(n) = 0, \qquad \lim_{n\to\infty} \ell_-(n) = \infty \tag{2.9} \]

holds. Then it follows that the sequence of $L^2(D)$-valued jump processes $(\nu^n_t)_{t\ge 0}$ converges in probability
uniformly on compact time intervals to the solution $\nu$ of the Wilson-Cowan equation (1.3), i.e., for all
$T, \epsilon > 0$ it holds that

\[ \lim_{n\to\infty} \mathbb{P}^n\Big[ \sup_{t\in[0,T]} \|\nu^n_t - \nu(t)\|_{L^2(D)} > \epsilon \Big] = 0. \tag{2.10} \]

Moreover, if for $r \ge 1$ the initial conditions satisfy in addition $\sup_{n\in\mathbb{N}} \mathbb{E}^n \|\nu^n_0\|^{2r}_{L^2} < \infty$, then convergence
in the $r$-th mean holds, i.e., for all $T > 0$

\[ \lim_{n\to\infty} \mathbb{E}^n \sup_{t\in[0,T]} \|\nu^n_t - \nu(t)\|^{r}_{L^2(D)} = 0. \tag{2.11} \]

Remark 2.1 The norm of uniform convergence $\sup_{t\in[0,T]} \|\cdot\|_{L^2}$, for which we have stated
convergence in probability and in the mean in Theorem 2.1, is a very strong norm on the space of
$L^2(D)$-valued càdlàg functions on $[0,T]$. Hence, due to continuous embeddings, the result
immediately extends to weaker norms, e.g., the norms of $L^p((0,T), L^2(D))$ for all $1 \le p < \infty$. Also for the state
space weaker spatial norms can be chosen, e.g., $L^p(D)$ with $1 \le p < 2$ or any norm on the duals
$H^{-\alpha}(D)$ of Sobolev spaces with $\alpha > 0$. If weaker norms for the state space are considered it is
even possible to relax the conditions of Theorem 2.1 by sharpening some estimates in the proof
of the theorem. Clearly, it is then sufficient that the initial conditions converge in probability with
respect to the weaker norms. Recall that $H^{-\alpha}(D)$ denotes the dual of the Sobolev space $H^{\alpha}(D)$ and
$H^{0}(D) = L^2(D) = H^{-0}(D)$. The results in the following corollary cover the whole range of $\alpha \ge 0$
and split it into sections with weakening conditions. In particular, note that after passing to weaker
norms the convergence does not necessitate that the neuron numbers per population diverge.
However, regarding the divergence of the number of neuron populations, the condition $\delta_+(n) \to 0$ cannot be
relaxed.

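Corollary 2.1 Let $\alpha \ge 0$ and set

\[ q := \begin{cases} \dfrac{2d}{d+2\alpha} & \text{if } 0 \le \alpha < d/2, \\ 1+ & \text{if } \alpha = d/2, \\ 1 & \text{if } d/2 < \alpha < \infty. \end{cases} \tag{2.12} \]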

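Further, assume that $w \in L^q(D) \times L^2(D)$ and $I \in L^2_{loc}(\mathbb{R}_+, H^1(D))$ and that the sequence of initial
conditions converges to $\nu(0)$ in probability in the space $H^{-\alpha}(D)$, that $\lim_{n\to\infty} \delta_+(n) = 0$ and

\[ \begin{aligned}
\lim_{n\to\infty} \frac{v_+(n)^{2\alpha/d}}{\ell_-(n)} &= 0 && \text{if } 0 \le \alpha < d/2, \\
\lim_{n\to\infty} \frac{v_+(n)^{1-}}{\ell_-(n)} &= 0 && \text{if } \alpha = d/2, \\
\lim_{n\to\infty} \frac{v_+(n)}{\ell_-(n)} &= 0 && \text{if } d/2 < \alpha < \infty,
\end{aligned} \tag{2.13} \]

where $1-$ denotes an arbitrary positive number strictly smaller than 1. Then it holds for all $T, \epsilon > 0$ that

\[ \lim_{n\to\infty} \mathbb{P}^n\Big[ \sup_{t\in[0,T]} \|\nu^n_t - \nu(t)\|_{H^{-\alpha}(D)} > \epsilon \Big] = 0 \]

and for $r \ge 1$, if the additional boundedness assumptions of Theorem 2.1 are satisfied, that for all $T > 0$

\[ \lim_{n\to\infty} \mathbb{E}^n \sup_{t\in[0,T]} \|\nu^n_t - \nu(t)\|^{r}_{H^{-\alpha}(D)} = 0. \]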

Remark 2.2 We believe that fruitful and illustrative comparisons of these convergence results and
their conditions to the results in Kotelenez [17, 18] and, particularly, Blount [3] can be made. Here
we just mention that the latter author conjectured the conditions (2.13) to be optimal for the
convergence but was not able to prove this result in his model of chemical reactions with diffusions
for the region $\alpha \in (0, d/2]$. For our model we could achieve these rates.

2.1.1 Infinite-time convergence 

In the law of large numbers, Theorem 2.1, and its Corollary 2.1 we have presented results of
convergence over finite time intervals. Employing a different technique, we are also able to derive a
convergence result over the whole positive time axis, motivated by a similar result in [32]. The proof
of the following theorem is deferred to Section 4.3. Restricted to finite time intervals the
subsequent result is strictly weaker than Theorem 2.1. However, the result is important when one wants
to analyse the mean long time behaviour of the stochastic model via a bifurcation analysis of the
deterministic limit, as (2.14) suggests that $\mathbb{E}^n \nu^n_t$ is close to $\nu(t)$ for all times $t \ge 0$ for sufficiently
large $n$.

Theorem 2.2 Let $\alpha \ge 0$ and assume that the conditions of Corollary 2.1 are satisfied. We further
assume that the current input function $I \in L^2_{loc}(\mathbb{R}_+, H^1(D))$ satisfies $\|\nabla_x I\|_{L^\infty(\mathbb{R}_+, L^2(D))} < \infty$, i.e., it
is square integrable in $H^1(D)$ over bounded intervals and possesses first spatial derivatives bounded for
almost all $t \ge 0$ in $L^2(D)$. Then it holds that

\[ \lim_{n\to\infty} \sup_{t\ge 0} \mathbb{E}^n \|\nu^n_t - \nu(t)\|_{H^{-\alpha}(D)} = 0. \tag{2.14} \]

2.2 A martingale central limit theorem 

In this section we present a central limit theorem for a sequence of martingales associated with
the jump processes. A brief, heuristic discussion of the method of proof for the law of large
numbers explains the importance of these martingales and motivates their study. In the proof of the
law of large numbers the central argument relies on the fact that the process $(\nu^n_t)_{t\ge 0}$ satisfies the
decomposition

\[ \nu^n_t = \nu^n_0 + \int_0^t \Lambda^n(\Theta^n_s,s) \int_{\mathbb{N}^P} \big( \nu^n(\xi) - \nu^n(\Theta^n_s) \big)\, \mu^n\big((\Theta^n_s,s),d\xi\big)\, ds + M^n_t. \tag{2.15} \]

Here the process $(M^n_t)_{t\ge 0}$ is a Hilbert space-valued, square-integrable, càdlàg martingale and, using
(2.15) as its definition, is given by

\[ M^n_t = \nu^n_t - \nu^n_0 - \int_0^t \Lambda^n(\Theta^n_s,s) \int_{\mathbb{N}^P} \big( \nu^n(\xi) - \nu^n(\Theta^n_s) \big)\, \mu^n\big((\Theta^n_s,s),d\xi\big)\, ds. \]

We have also used this representation of the process $\nu^n$ in the proof of Theorem 2.2, see Section
4.3. We note that the Bochner integral in (2.15) is a.s. well defined due to bounded second moments
of the integrand, see (4.7) in the proof of Theorem 2.1. A heuristic argument to obtain the
convergence to the solution of the Wilson-Cowan equation is the following: the initial conditions
converge, the martingale term $M^n$ converges to zero and the integral term on the right hand side of
(2.15) converges to the right hand side of the Wilson-Cowan equation (1.3). Hence, the 'solution' $\nu^n$
of (2.15) converges to the solution $\nu$ of the Wilson-Cowan equation (1.3). Interpreting equation
(2.15) as a stochastic evolution equation which is driven by the martingale $(M^n_t)_{t\ge 0}$ sheds light
on the importance of the study of this term: from this point of view the martingale part
in the decomposition (2.15) contains all the stochasticity inherent in the system. The idea
for deriving a Langevin or linear noise approximation is then to find a non-trivial stochastic limit (in
distribution) for the sequence of martingales and to substitute, heuristically, this limiting martingale
into the stochastic evolution equation. It is expected that this new and much less complex
process behaves similarly to the process $(\nu^n_t)_{t\ge 0}$ for sufficiently large $n$. Deriving a suitable limit for
$(M^n_t)_{t\ge 0}$ is what we set out to do next. The result can be found in Theorem 2.3 below and takes the
form of a central limit theorem.
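
The heuristic claim that the integral term in the decomposition approaches the right hand side of the Wilson-Cowan equation can be made more concrete by evaluating the inner integral. The following worked step is a sketch under two assumptions: the jumps $\theta \mapsto \theta + e_k$ and $\theta \mapsto \theta - e_k$ occur with rates $\tau^{-1} l(k,n)\bar f_{k,n}(\theta,t)$ and $\tau^{-1}\theta^k$, which add up to the total jump rate $\Lambda^n$, and the coordinate function assigns to each population its fraction of active neurons, $\nu^n(\theta) = \sum_k \theta^k / l(k,n)\, \mathbb{1}_{D_{k,n}}$.

```latex
\begin{align*}
\Lambda^n(\theta,t) \int_{\mathbb{N}^P} \big(\nu^n(\xi) - \nu^n(\theta)\big)\,
  \mu^n\big((\theta,t),d\xi\big)
&= \frac{1}{\tau} \sum_{k=1}^{P} \Big[ \theta^k \big(\nu^n(\theta - e_k) - \nu^n(\theta)\big)
   + l(k,n)\,\bar f_{k,n}(\theta,t) \big(\nu^n(\theta + e_k) - \nu^n(\theta)\big) \Big] \\
&= \frac{1}{\tau} \sum_{k=1}^{P} \Big[ -\frac{\theta^k}{l(k,n)} + \bar f_{k,n}(\theta,t) \Big]
   \mathbb{1}_{D_{k,n}} \\
&= \frac{1}{\tau} \Big( -\nu^n(\theta) + \sum_{k=1}^{P} \bar f_{k,n}(\theta,t)\, \mathbb{1}_{D_{k,n}} \Big).
\end{align*}
```

The first term reproduces the linear decay term of the Wilson-Cowan equation, while the second is a piecewise-constant approximation of the gain term evaluated at the current activity profile.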

First of all, what has been said so far implies the necessity of rescaling the martingale with a
diverging sequence in order to obtain a non-trivial limit. The conditions in the law of large numbers
imply in particular that the martingale converges uniformly in the mean square to zero, i.e.,

\[ \lim_{n\to\infty} \mathbb{E}^n \sup_{t\in[0,T]} \|M^n_t\|^2_{L^2} = 0, \]

which in turn implies convergence in probability and convergence in distribution to the zero limit.

Furthermore, in contrast to Euclidean spaces, norms on infinite-dimensional spaces are usually
not equivalent. In Corollary 2.1 we exploited this fact as it allowed us to obtain convergence results
under less restrictive conditions by changing to strictly weaker norms. In the formulation and proof
of central limit theorems, the change to weaker norms even becomes an essential ingredient. It is
often observed in the literature, see, e.g., [U llTlfTS], that central limit theorems cannot be proven in
the strongest norm for which the law of large numbers holds, e.g., $L^2(D)$ in the present setting, but
only in a strictly weaker norm. Here this norm is the norm in the dual of an appropriate Sobolev
space. Hence, from now on we consider for all $n \in \mathbb{N}$ the processes $(\nu^n_t)_{t\ge 0}$ and the martingales
$(M^n_t)_{t\ge 0}$ as taking values in the space $H^{-\alpha}(D)$ for an $\alpha > d$, where $d$ is the dimension of the spatial
domain $D$, using the embedding of $L^2(D)$ into $H^{-\alpha}(D)$. The technical significance of the restriction
$\alpha > d$ is that these are the indices such that there exists an embedding of $H^{\alpha}(D)$ into an $H^{\alpha_1}(D)$ with
$d/2 < \alpha_1 < \alpha$ which is of Hilbert-Schmidt type due to Maurin's Theorem, and $H^{\alpha_1}(D)$ is embedded
into $C(D)$ due to the Sobolev Embedding Theorem. These two properties are essential for the proof
of the central limit theorem and their role will be made clear subsequently.

The limit we propose for the rescaled martingale sequence is a centred diffusion process in $H^{-\alpha}(D)$,
that is, a centred Gaussian stochastic process $(X_t)_{t\ge 0}$ taking values in $H^{-\alpha}(D)$ with independent
increments and given covariance $C(t)$, $t \ge 0$; see, e.g., [12, 25] for a discussion of Gaussian processes
in Hilbert spaces. Such a process is uniquely defined by its covariance operator and conversely, each
family of linear, bounded operators $C(t) : H^{\alpha}(D) \to H^{-\alpha}(D)$, $t \ge 0$, uniquely defines a diffusion
process if

(i) each $C(t)$ is symmetric and positive, i.e.,

\[ \langle C(t)\phi, \psi \rangle_{H^\alpha} = \langle C(t)\psi, \phi \rangle_{H^\alpha} \quad \text{and} \quad \langle C(t)\phi, \phi \rangle_{H^\alpha} \ge 0, \]

(ii) each $C(t)$ is of trace class, i.e., for one (and thus every) orthonormal basis $\varphi_j$, $j \in \mathbb{N}$, in $H^{\alpha}(D)$
it holds that

\[ \sum_{j=1}^{\infty} \langle C(t)\varphi_j, \varphi_j \rangle_{H^\alpha} < \infty, \tag{2.16} \]

(iii) and the family $C(t)$, $t \ge 0$, is continuously increasing in $t$ in the sense that the map
$t \mapsto \langle C(t)\phi, \psi \rangle_{H^\alpha}$ is continuous and increasing for all $\phi, \psi \in H^{\alpha}(D)$.

Footnote (Hilbert-Schmidt embeddings): A continuous embedding of two Hilbert spaces $X \hookrightarrow Y$ is of Hilbert-Schmidt type if for every orthonormal
basis $\varphi_j$, $j \in \mathbb{N}$, of $X$ it holds that $\sum_{j=1}^{\infty} \|\varphi_j\|^2_Y < \infty$. Then, more precisely, Maurin's Theorem states that for
non-negative integers $m, k$ the embedding of $H^{m+k}(D)$ into $H^{m}(D)$ is of Hilbert-Schmidt type for $k > d/2$, see [2].
The result was generalized to fractional order Sobolev spaces in [35]: Let $D$ be a bounded, strong local Lipschitz
domain in $\mathbb{R}^d$ and let $0 \le \alpha_1 < \alpha_2$ be real numbers. Then it holds that the embedding of $H^{\alpha_2 + d/2}(D)$ into $H^{\alpha_1}(D)$
is of Hilbert-Schmidt type.

Footnote (covariance operators): Usually the covariance operator for a Hilbert space-valued process is an operator mapping from the state space
into the state space and not into the dual, i.e., in the present situation mapping $H^{-\alpha}(D)$ into itself. Due to the
canonical embedding of Hilbert spaces into their dual and the Riesz Representation, however, we can effortlessly
change from the usual definition to ours and vice versa. Moreover, the symmetry condition thus implies due to the
Hellinger-Toeplitz Theorem that the operator is self-adjoint and hence of trace class if and only if (2.16) is satisfied.
The choice of the presentation here is due to the fact that it is simpler to evaluate the duality pairing on $H^{-\alpha}(D)$
than the inner product thereon, as the former usually is just the inner product in $L^2(D)$.

We next define the process which will be the limit identified in the martingale central limit
theorem via its covariance. In order to define the operator $C$ we first define a family of linear
operators $G(\nu(t),t)$ mapping from $H^{\alpha}(D)$ into the dual space $H^{-\alpha}(D)$ via the bilinear form

\[ \langle G(\nu(t),t)\phi, \psi \rangle_{H^\alpha} = \int_D \phi(x) \Big( \frac{1}{\tau}\nu(t,x) + \frac{1}{\tau} f\Big( \int_D w(x,y)\nu(t,y)\,dy + I(t,x) \Big) \Big) \psi(x)\,dx. \tag{2.17} \]

It is obvious that this bilinear form is symmetric and positive and, as $\nu(t)$ is continuous in $t$, it holds
that the map $t \mapsto \langle G(\nu(t),t)\phi, \psi \rangle_{H^\alpha}$ is continuous for all $\phi, \psi \in H^{\alpha}(D)$. Furthermore, it is easy to
see that the operator is bounded, i.e.,

\[ \|G(\nu(t),t)\|_{L(H^{\alpha},H^{-\alpha})} = \sup_{\|\phi\|_{H^\alpha}=1} \|G(\nu(t),t)\phi\|_{H^{-\alpha}} = \sup_{\|\phi\|_{H^\alpha}=1}\, \sup_{\|\psi\|_{H^\alpha}=1} \big| \langle G(\nu(t),t)\phi, \psi \rangle_{H^\alpha} \big| < \infty, \]

as the solution $\nu$ of the Wilson-Cowan equation and the gain function $f$ are pointwise bounded.
Hence, due to the Cauchy-Schwarz inequality, $|\langle G(\nu(t),t)\phi, \psi \rangle_{H^\alpha}|$ is bounded by a constant multiple of the
product $\|\phi\|_{L^2} \|\psi\|_{L^2}$, and for any $\alpha \ge 0$ the Sobolev Embedding Theorem now gives a uniform bound
in terms of the norms of $\phi, \psi$ in $H^{\alpha}(D)$. As a final property we show that these operators are of
trace class if $\alpha > d/2$. Thus let $(\varphi_j)_{j\in\mathbb{N}}$ be an orthonormal basis in $H^{\alpha}(D)$; then the Cauchy-Schwarz
inequality yields

\[ \big| \langle G(\nu(t),t)\varphi_j, \varphi_j \rangle_{H^\alpha} \big| \le \frac{1}{\tau} \big( 1 + \|f\|_0 \big) \|\varphi_j\|^2_{L^2}. \]

Summing these inequalities over all $j \in \mathbb{N}$ we find that the resulting right hand side is finite, since due to
Maurin's Theorem the embedding of $H^{\alpha}(D)$ into $L^2(D)$ is of Hilbert-Schmidt type. Moreover, the
trace is even bounded independently of $t$.

Now, it holds that the map $t \mapsto G(\nu(t),t)$ is continuous taking values in the Banach space of
trace class operators, hence we define trace class operators $C(t)$ from $H^{\alpha}(D)$ into $H^{-\alpha}(D)$ via the
Bochner integral for all $t \ge 0$

\[ C(t) := \int_0^t G(\nu(s),s)\,ds. \tag{2.18} \]

Clearly, the resulting bilinear form $\langle C(t)\,\cdot, \cdot \rangle_{H^\alpha}$ inherits the properties of the bilinear form (2.17).
Moreover, due to the positivity of the integrands it follows that $\langle C(t)\phi, \phi \rangle_{H^\alpha}$ is increasing in $t$ for
all $\phi \in H^{\alpha}(D)$. Hence the family of operators $C(t)$, $t \ge 0$, satisfies the above conditions (i)-(iii) and
thus uniquely defines an $H^{-\alpha}(D)$-valued diffusion process.

We are now able to state the martingale central limit theorem. The proof of the theorem is
deferred to Section 4.

Theorem 2.3 (Martingale central limit theorem) Let $\alpha > d$ and assume that the conditions of
Theorem 2.1 are satisfied. In particular, convergence in the mean holds, i.e., (2.11) holds for $r = 1$.
Additionally, we assume it holds that

\[ \lim_{n\to\infty} \frac{v_-(n)}{v_+(n)} = \lim_{n\to\infty} \frac{\ell_-(n)}{\ell_+(n)} = 1. \]

Then it follows that the sequence of rescaled $H^{-\alpha}(D)$-valued martingales

\[ \Big( \sqrt{\frac{\ell_-(n)}{v_+(n)}}\; M^n_t \Big)_{t\ge 0} \]

converges weakly to the $H^{-\alpha}(D)$-valued diffusion process defined by the covariance operator $C(t)$ given by
(2.18).

Remark 2.3 In connection with the results of Theorem 2.3 two questions may arise. First, in what
sense is there uniqueness of the rescaling sequence and hence of the limiting diffusion? That is, does
a different scaling also produce a (non-trivial) limit, or, rephrased, is the proposed scaling the correct
one to look at? Secondly, the theorem deals with the norms for the range of $\alpha > d$ in the Hilbert scale;
what can be said about convergence in the stronger norms corresponding to the range of $\alpha \in [0,d]$?
Does there exist a limit? We conclude this section addressing these two issues.

Regarding the first question, it is immediately obvious that the rescaling sequence $\ell_-(n)/v_+(n)$, which
we denote by $\rho_n$ in the following, is not the unique sequence yielding a non-trivial limit. Rescaling
the martingales $M^n$ by any sequence of the form $\sqrt{c\rho_n}$ yields a convergent martingale sequence.
However, the limiting diffusion differs only in a covariance operator which is also rescaled by $c$
and hence the limit is essentially the same process with either 'stretched' or 'shrunk' variability.
The asymptotic behaviour of the rescaling sequences which allow for a non-trivial weak limit
is, however, unique. In general, by considering different rescaling sequences $\rho^*_n$ we obtain three possibilities for
the convergence of the sequence $\sqrt{\rho^*_n}\,M^n$. First, if $\rho^*_n$ is of the same order as $\rho_n$, i.e., for
$\rho^*_n = O(\rho_n)$, the thus rescaled sequence converges again to a diffusion process for which the covariance
operator is proportional to (2.18). This is then just a rescaling by a sequence (asymptotically)
proportional to $\rho_n$ as discussed above. Secondly, if the divergence is slower, i.e., $\rho^*_n = o(\rho_n)$, then
the same methods as in the law of large numbers show that the sequence converges uniformly on
compacts in probability to zero, hence also convergence in distribution to the degenerate zero process
follows. Thus one only obtains the trivial limit. Finally, if we rescale by a sequence that diverges
faster, i.e., $\rho_n = o(\rho^*_n)$, we can show that there does not exist a limit. This follows from general
necessary conditions for the preservation of weak limits under transformation, which presuppose that
$\sqrt{\rho^*_n/\rho_n}$ has to converge in order for $\sqrt{\rho^*_n}\,M^n$ to possess a limit in distribution, see
[29, Thm. 2]. As the sequence $\rho^*_n/\rho_n$ diverges, this clearly cannot hold.

Unfortunately, the second question cannot be answered with the same clarity when considering
non-trivial limits. Essentially, we can only say that the currently used methods do not allow for any
conclusion on convergence. The limitations are the following: The central problem is that for the
parameter range $\alpha \in [0,d]$ the current method does not provide tightness of the rescaled martingale
sequence, hence we cannot infer that the sequence possesses a convergent subsequence. However, if
tightness can be established in a different way, then for the range $\alpha \in (\max\{1, d/2\}, d]$ the limit has
to be the diffusion process defined by the operator (2.18), as follows from the characterisation of any
limit in the proof of the theorem. Here, the lower bound of $\max\{1, d/2\}$ results, on the one hand,
from our estimation technique, which necessitates $\alpha > 1$, and, on the other hand, from the definition
of the limiting diffusion. Recall that the covariance operator is only of trace class for $\alpha > d/2$. Hence
for $\alpha \in [0, d/2]$ we can no longer infer that the limiting diffusion even exists.



2.3 The mean-field Langevin equation 

An important property of the limiting diffusion with a view towards analytic and numerical studies is
that it can be represented by a stochastic integral with respect to a cylindrical or Q-Wiener process.
For a general discussion of infinite-dimensional stochastic integrals we refer to [12]. First, let $(W_t)_{t\ge 0}$
be a cylindrical Wiener process on $H^{-\alpha}(D)$ with covariance operator being the identity. Then,
$G(\nu(t),t) \circ \iota^{-1}$ is a trace class operator on $H^{-\alpha}(D)$ for suitable values of $\alpha$. Here $\iota^{-1} : H^{-\alpha}(D) \to
H^{\alpha}(D)$ is the Riesz Representation, i.e., the usual identification of a Hilbert space with its dual. The
operator $G(\nu(t),t) \circ \iota^{-1}$ possesses a unique square root, which we denote by $\sqrt{G(\nu(t),t) \circ \iota^{-1}}$ and which is a
Hilbert-Schmidt operator on $H^{-\alpha}(D)$. It follows that the stochastic integral process

\[ Z_t := \int_0^t \sqrt{G(\nu(s),s) \circ \iota^{-1}}\; dW_s \tag{2.20} \]

is a diffusion process in $H^{-\alpha}(D)$ with covariance operator $C(t)$. That is, $(Z_t)_{t\ge 0}$ is a version of the
limiting diffusion in Theorem 2.3. Now, formally substituting for the limits in (2.15) yields the linear
noise approximation

\[ U_t = \nu_0 + \int_0^t \tau^{-1} \big( -U_s + F(U_s,s) \big)\, ds + \epsilon_n \int_0^t \sqrt{G(\nu(s),s) \circ \iota^{-1}}\; dW_s, \]

or in differential notation

\[ dU_t = \tau^{-1} \big( -U_t + F(U_t,t) \big)\, dt + \epsilon_n \sqrt{G(\nu(t),t) \circ \iota^{-1}}\; dW_t, \qquad U_0 = \nu_0, \tag{2.21} \]

where $\epsilon_n = \sqrt{v_+(n)/\ell_-(n)}$ is small for large $n$. Here we have used the operator notation

\[ F : H^{-\alpha}(D) \times \mathbb{R}_+ \to H^{-\alpha}(D) : \quad F(g,t)(x) = f\big( \langle g, w(x,\cdot) \rangle_{H^\alpha} + I(t,x) \big). \]

Equation (2.21) is an infinite-dimensional stochastic differential equation with additive (linear) noise.
Here additive means that the coefficient in the diffusion term does not depend on the solution $U_t$. A
second formal substitution yields the Langevin approximation. Here the dependence of the diffusion
coefficient on the deterministic limit $\nu$ is formally substituted by a dependence on the solution. That
is, we obtain a stochastic partial differential equation with multiplicative noise given by

\[ V_t = V_0 + \int_0^t \tau^{-1} \big( -V_s + F(V_s,s) \big)\, ds + \epsilon_n \int_0^t \sqrt{G(V_s,s) \circ \iota^{-1}}\; dW_s, \]

or in differential notation

\[ dV_t = \tau^{-1} \big( -V_t + F(V_t,t) \big)\, dt + \epsilon_n \sqrt{G(V_t,t) \circ \iota^{-1}}\; dW_t. \tag{2.22} \]

Note that the derivation of the above equations was only formal, hence we have to address the
existence and uniqueness of solutions and the proper setting for these equations. This is left for
future work. Furthermore, it is an ongoing discussion, and probably undecidable for lack of a criterion
of approximation quality, which (if any at all) is the correct diffusion approximation to use. First
of all, note that for both versions the noise term vanishes for $n \to \infty$ and thus both have the
Wilson-Cowan equation as their limit. Also, neither of them approximates even the first moment of the
microscopic models exactly. This means that for neither of them the mean solves the
Wilson-Cowan equation, which would only be the case if $f$ were linear. However, they are close to the mean
of the discrete process. We discuss this aspect in Appendix B.

Furthermore, we already observe in the central limit theorem and thus also in the linear noise
and Langevin approximation that the covariance (2.18) or the drift and the structure of the diffusion
terms in (2.21) and (2.22), respectively, are independent of objects resulting from the microscopic
models. They are defined purely in terms of the macroscopic limit. This observation supports the
conjecture that these approximations are independent of possible different microscopic models
converging to the same deterministic limit. Analogous statements hold also for derivations from the
van Kampen system size expansion [5] and in related limit theorems for reaction diffusion models [IJ
1171118]. The only object reminiscent of the microscopic models in the continuous approximations is
the rescaling sequence $\epsilon_n$. However, the rescaling is proportional to the square root of $\ell_-(n)/v_+(n)$,
i.e., the number of neurons per area divided by the size of the area, which is just the local density
of particles. Therefore, in the approximations the noise scales inversely to the square root of neuron
density in this model, which, interpreted in this way, can also be considered a macroscopic fixed
parameter and chosen independently of the approximating sequence.

Remark 2.4 The stochastic partial differential equations (2.21) and (2.22), which we proposed as the
linear noise and Langevin approximation, respectively, are not necessarily unique, as the representation
of the limiting diffusion as a stochastic integral process (2.20) may not be unique. It will be the subject
of further research to analyse the practical implications and usability of this Langevin
approximation. Let $Q$ be a trace class operator, $(W^Q_t)_{t\ge 0}$ be a $Q$-Wiener process and let $B(\nu(t),t)$ be
operators such that $B(\nu(t),t) \circ Q \circ B(\nu(t),t)^* = G(\nu(t),t) \circ \iota^{-1}$, where $^*$ denotes the adjoint operator.
Then also the stochastic integral process

\[ Z^Q_t := \int_0^t B(\nu(s),s)\, dW^Q_s \]

is a version of the limiting diffusion in Theorem 2.3 and the corresponding linear noise and Langevin
approximations are given by

\[ dU^Q_t = \tau^{-1} \big( -U^Q_t + F(U^Q_t,t) \big)\, dt + \epsilon_n B(\nu(t),t)\, dW^Q_t \]

and

\[ dV^Q_t = \tau^{-1} \big( -V^Q_t + F(V^Q_t,t) \big)\, dt + \epsilon_n B(V^Q_t,t)\, dW^Q_t. \]

We conclude this section by presenting one particular choice of a diffusion coefficient and a Wiener
process. We take $(W^Q_t)_{t\ge 0}$ to be a cylindrical Wiener process on $L^2(D)$ with covariance $Q = \mathrm{Id}_{L^2}$.
Then we can choose $B(t) = j \circ (\cdot\sqrt{g(t)}) \in L(L^2(D), H^{-\alpha}(D))$, where $j$ is the embedding operator
$L^2(D) \hookrightarrow H^{-\alpha}(D)$ in the sense of (fO)) and $(\cdot\sqrt{g(t)}) \in L(L^2(D), L^2(D))$ denotes the pointwise product
of a function in $L^2(D)$ with $\sqrt{g(t)}$, i.e.,

\[ \big( \phi \cdot \sqrt{g(t)} \big)(x) = \phi(x) \sqrt{ \tau^{-1}\nu(t,x) + \tau^{-1} f\Big( \int_D w(x,y)\nu(t,y)\,dy + I(t,x) \Big) }. \]

We first investigate the operator $G(\nu(t),t) \circ \iota^{-1}$ and write it in more detail as the following composition
of operators

\[ G(\nu(t),t) \circ \iota^{-1} = j \circ (\cdot\, g(t)) \circ k \circ \iota^{-1}, \]

where $k$ is the embedding operator $H^{\alpha}(D) \hookrightarrow L^2(D)$. Next, the Hilbert adjoint $B^* \in L(H^{-\alpha}, L^2)$
is given by $B^* = (\cdot\sqrt{g}) \circ k \circ \iota^{-1}$, which is easy to verify. Hence the stochastic integral of $B(t)$ with
respect to $W^Q$ is again a version of the limiting martingale as

\[ B(t) \circ Q \circ B^*(t) = j \circ (\cdot\sqrt{g(t)}) \circ \mathrm{Id}_{L^2} \circ (\cdot\sqrt{g(t)}) \circ k \circ \iota^{-1} = j \circ (\cdot\, g(t)) \circ k \circ \iota^{-1} = G(\nu(t),t) \circ \iota^{-1}. \]
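
To give a feeling for how the Langevin approximation with the pointwise diffusion coefficient $\sqrt{g}$ might be explored numerically, the following Python sketch applies an Euler-Maruyama step to a crude finite-difference discretisation on a one-dimensional grid. This is an illustrative assumption and not a scheme taken from the text: existence and uniqueness of solutions are left open above, the cylindrical Wiener process on $L^2(D)$ is replaced by independent Gaussian increments with variance `dt / dx` per grid cell, and the argument of the square root is clipped at zero because the numerical solution may leave $[0,1]$. All names are hypothetical.

```python
import math
import random


def langevin_euler_maruyama(v0, w, inp, f, tau, eps, dx, dt, n_steps, rng):
    """Euler-Maruyama sketch for a grid version of the Langevin approximation.

    v0  : initial values on the grid
    w   : weight kernel on the grid, w[i][j] ~ w(x_i, x_j)
    inp : function (t, i) -> external input at time t and grid point i
    f   : gain function
    eps : noise amplitude (plays the role of epsilon_n)
    """
    v = list(v0)
    m = len(v)
    for step in range(n_steps):
        t = step * dt
        # F(V, t)(x_i) approximated by f( sum_j w(x_i, x_j) V_j dx + I(t, x_i) ).
        drive = [f(sum(w[i][j] * v[j] for j in range(m)) * dx + inp(t, i))
                 for i in range(m)]
        new_v = []
        for i in range(m):
            drift = (-v[i] + drive[i]) / tau
            # g = (V + F(V, t)) / tau, clipped at zero (assumption of this sketch).
            g = max((v[i] + drive[i]) / tau, 0.0)
            # Discretised space-time white noise: variance dt / dx per cell.
            noise = rng.gauss(0.0, 1.0) * math.sqrt(dt / dx)
            new_v.append(v[i] + drift * dt + eps * math.sqrt(g) * noise)
        v = new_v
    return v
```

With `eps = 0.0` the scheme reduces to an explicit Euler discretisation of the deterministic equation, which provides a simple consistency check for the drift term.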



3 Discussion and extensions 

In this article we have presented limit theorems that connect finite, discrete microscopic models
of neural activity to the Wilson-Cowan neural field equation. The results state qualitative
connections between the models, formulated as precise probabilistic convergence concepts. Thus the results
strengthen the connection derived in a heuristic way from the van Kampen system size expansion.

A general limitation of mathematically precise approaches to approximations, cf. also the
propagation of chaos limit theorems in [30], is that the microscopic models are usually defined via the
limit. In other words, the limit has to be known a priori and we look for models which converge
to this limit. Thus, in contrast to the van Kampen system size expansion, the presented results are
not a step-by-step modelling procedure in the sense that, via a constructive limiting procedure, a
microscopic model yields a deterministic or stochastic approximation. Hence, it might be objected
that the presented method can only be used a posteriori in order to justify a macroscopic model from
a constructed microscopic model and that somehow one has to 'guess' the correct limit in advance.
Several remarks can be made to answer this objection.

First, this observation is certainly true, but not necessarily a drawback. On the contrary, when
both microscopic and macroscopic models are available, it is rather important to know how
these are connected and to characterise this connection qualitatively and quantitatively. Concerning
neural field models, this precise connection was simply not available so far for the well-established
Wilson-Cowan model. Furthermore, when starting from a stochastic microscopic description, working
through the conditions for convergence for given microscopic models yields very strong
hints on the structure of a possible deterministic limit. Therefore our results can also ease the
procedure of 'guessing the correct limit'.

Secondly, often a phenomenological, deterministic model which is an approximation to an
inherently probabilistic process is derived from ad-hoc heuristic arguments. Given that the model has
proved useful, one often aims to derive a justification from first principles and / or a stochastic
version which keeps the features of the deterministic model but also accounts for the formerly
neglected fluctuations. A standard, though somewhat simple, approach to obtain stochastic versions
consists of adding (small) noise to the deterministic equations. This article provides a second
approach, which consists of finding microscopic models which converge to the deterministic limit and
obtaining a stochastic correction via a central limit argument.

Thirdly and finally, the method also provides an argument for new equations, i.e., the Langevin
and linear noise approximations, which can be used to study the stochastic fluctuations in the model.
Furthermore, in contrast to previous studies we do not provide deterministic moment equations but
stochastic processes, which can be studied, e.g., via Monte Carlo simulations, with respect to a large
number of pathwise properties and dynamics beyond first and second moments.

We now conclude this article by commenting on the feasibility of our approach connecting
microscopic Markov models to deterministic macroscopic equations when dealing with different master
equation formulations that appear in the literature. Additionally, the following discussions also
relate the model (1.6) considered in this article to other master equation formulations. We conjecture
that results analogous to those presented for the Wilson-Cowan equation (1.3) in Section 2 also hold
for these variations of the master equations. This should be possible to achieve by an adaptation of
the methods of proof presented, although we have not performed the computations in detail.



3.1 A Variation of the master equation formulation 

A first variation of the discrete model we discussed in Section 1.2 was considered in the articles [8,
9], and a version restricted to a bounded state space also appears in [31]. This model consists of the
master equation stated below in (3.2), which closely resembles (1.6). In the earlier reference [8] the
model was introduced with a different interpretation called the effective spike model. We briefly explain
this interpretation before presenting the master equation. Instead of interpreting $P$ as the number of
neuron populations, in this model $P$ denotes the number of different neurons in the network located
within a spatial domain $D$. Then $\Theta^k_t$, the state of the $k$th neuron, counts the number of 'effective'
spikes this neuron has emitted in the past up until time $t$. Effective spikes are those spikes that still
influence the dynamics of the system, e.g., via a post-synaptic potential. Then state transitions
adding / subtracting one effective spike for the $k$th neuron are governed by a firing rate function $\tilde f_k$,
which depends on the input into neuron $k$, and a decay rate $\tau^{-1}$. The constant decay rate indicates
that emitted spikes are effective for a time interval of length $\tau$, and the gain function is defined,
neglecting external input, by

\[ \tilde f_k(\theta) := f^*\Big( \sum_{j=1}^{P} W_{kj}\, \theta^j \Big), \]

where $f^*$ is a certain nonnegative, real function. It is stated clearly in [9] that the function $f^*$ is
not equal to the gain function $f$ in the proposed limiting Wilson-Cowan equation (1.3) but rather
connected to $f$ such that

\[ \mathbb{E} f^*\Big( \sum_{j=1}^{P} W_{kj}\, \Theta^j_t \Big) = f\Big( \sum_{j=1}^{P} W_{kj}\, \mathbb{E}\Theta^j_t \Big) + \text{higher order terms}. \tag{3.1} \]

The authors in [5] state that for any function $f$ such a function $f^*$ can be found. Then the process
$\Theta_t = (\Theta^1_t, \ldots, \Theta^P_t)$ is a jump Markov process with its evolution governed by the master equation

\[ \frac{d\,\mathbb{P}[\theta,t]}{dt} = \sum_{k=1}^{P} \Big[ \tilde f_k(\theta - e_k)\, \mathbb{P}[\theta - e_k, t] - \Big( \frac{\theta^k}{\tau} + \tilde f_k(\theta) \Big) \mathbb{P}[\theta, t] + \frac{1}{\tau} (\theta^k + 1)\, \mathbb{P}[\theta + e_k, t] \Big] \tag{3.2} \]

with boundary conditions $\mathbb{P}[\theta,t] = 0$ if $\theta \notin \mathbb{N}^P$ as stated in [9]. The advantage of the effective spike
model interpretation over the interpretation as neurons per population is that the unbounded state
space of the model is justified. In principle there can be an arbitrary number of spikes emitted in the
past still active. However, a disadvantage of the master equation (3.2) is that for taking the limit it
lacks a parameter corresponding to the system size, which would provide a natural small parameter in the van
Kampen system size expansion. This explains the shift in the interpretation of the master equation
in the following study [9] and subsequently in [5] to the interpretation we presented in Section
1.2, which provides the system-size parameters $l(k)$.

On the level of Markov jump processes the master equation (3.2) describes dynamics similar to the master equation (1.6), only replacing the activation rate $\tau^{-1} l(k)\bar f_k(\theta)$ in (1.6) by $f_k(\theta)$, which is independent of the parameter $l(k)$. Thus, the model (3.2) can be understood as resulting from (1.6) after a limit procedure taking $l(k) \to \infty$ has been applied, where the firing rate functions are connected via the formal limit $\lim_{l(k)\to\infty} l(k)\bar f_k(\theta) = f_k(\theta)$. A qualitative interpretation of this limit procedure connecting the two types of models is given in [8]. This observation motivated the model in [5], which steps back one limit procedure and thus provides the correct framework for the derivation of limit theorems.
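
To make the difference between the two rate structures concrete, the following minimal sketch evaluates the activation and decay rates of both models for a given state, neglecting external input. The function names, the sigmoid gain and the weight values are illustrative assumptions and are not taken from the references.

```python
import math

def sigmoid(x):
    """Illustrative gain function (an assumption); the text only needs a bounded Lipschitz f."""
    return 1.0 / (1.0 + math.exp(-x))

def rates_system_size(theta, l, W, tau=1.0, f=sigmoid):
    """Rates in the spirit of the master equation (1.6), external input neglected.

    Activation of population k: l[k] * f(sum_j W[k][j] * theta[j] / l[j]) / tau.
    Decay of population k:      theta[k] / tau.
    """
    P = len(theta)
    up = [l[k] * f(sum(W[k][j] * theta[j] / l[j] for j in range(P))) / tau
          for k in range(P)]
    down = [theta[k] / tau for k in range(P)]
    return up, down

def rates_effective_spike(theta, W, tau=1.0, f_star=sigmoid):
    """Rates of the effective spike model (3.2): no system-size factor l(k) in the activation."""
    P = len(theta)
    up = [f_star(sum(W[k][j] * theta[j] for j in range(P))) for k in range(P)]
    down = [theta[k] / tau for k in range(P)]
    return up, down

theta = [3, 0]                      # hypothetical state
W = [[0.5, -0.2], [0.1, 0.3]]       # hypothetical weights
up_l, down_l = rates_system_size(theta, l=[10, 10], W=W)
up_s, down_s = rates_effective_spike(theta, W=W)
```

The decay rates coincide in both models; only the activation rate of (1.6) carries the system-size parameter $l(k)$, which is exactly the parameter missing in (3.2).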

It would be an interesting addition to the limit theorems in Theorem 2.1 to derive a law of large numbers for the models (3.2) with stochastic mean activity $\nu^n$ as defined in (2.7) and suitably chosen weights $W_{kj}$. Clearly, the macroscopic limit should be given by the Wilson-Cowan equation (1.3). We conjecture that the appropriate condition for the function $f^*$ in the present setting, including time dependent inputs, is

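$$\mathbb{E}\Big[\, l(k,n)^{-1} f^*\Big(\sum_{j=1}^{P} \bar W^n_{kj}\,\Theta^j_t + \bar I_{k,n}(t)\Big)\Big] = f\Big(\sum_{j=1}^{P} \bar W^n_{kj}\,\frac{\mathbb{E}\,\Theta^j_t}{l(j,n)} + \bar I_{k,n}(t)\Big) + \text{h.o.t.}, \tag{3.3}$$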

such that the higher order terms are uniformly bounded and vanish in the limit $n \to \infty$, and where the weights $\bar W^n_{kj}$ and inputs $\bar I_{k,n}(t)$ are defined as in (2.4) and (2.6). Property (3.3) closely resembles condition (3.1) and trivially holds for linear $f$ with $f^* = f$.



3.2 Bounded state space master equations 

We have already stated when introducing the microscopic model in Section 1.2 that the interpretation of the parameter $l(k)$ as the number of neurons in the $k$-th population is not literally correct. The state space of the process is unbounded, hence arbitrarily many neurons can be active and thus each population contains arbitrarily many neurons. In order to overcome this interpretation problem it was proposed to consider the master equation only on a bounded state space. That is, the $k$-th population consists of $l(k)$ neurons and $0 \le \Theta^k_t \le l(k)$ almost surely. Such master equations are simply obtained by setting the transition rates for the transition of $\Theta^k$ from $l(k) \to l(k)+1$ to zero.
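
The truncation can be illustrated with a small Gillespie-type simulation of a single population in which the activation rate is switched off at the upper boundary. This is only a sketch under stated assumptions: the sigmoid gain, the self-coupling weight `w` and the function name `simulate_bounded` are hypothetical choices, and the activation rate is modelled on (1.6) rather than on any specific bounded model from the literature.

```python
import math
import random

def sigmoid(x):
    """Illustrative gain function (an assumption)."""
    return 1.0 / (1.0 + math.exp(-x))

def simulate_bounded(l, T, w=1.0, tau=1.0, seed=0, f=sigmoid):
    """Simulate one population whose activation rate is zero at theta = l.

    Returns the list of visited states up to time T.
    """
    rng = random.Random(seed)
    theta, t, visited = 0, 0.0, [0]
    while True:
        # Truncation: no transition from l to l + 1.
        up = l * f(w * theta / l) / tau if theta < l else 0.0
        down = theta / tau
        t += rng.expovariate(up + down)      # waiting time until the next jump
        if t > T:
            return visited
        theta += 1 if rng.random() * (up + down) < up else -1
        visited.append(theta)

path = simulate_bounded(l=5, T=50.0)
```

By construction every visited state lies in $\{0, \dots, l\}$, which is the bounded state space described above.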

A first master equation of this form was considered in [22] which, in present notation, takes the 
form 

P 

fc=i 

(3.4) 

Versions of such a master equation for, e.g., one population only or coupled inhibitory and excitatory populations were considered in [3, 22], and a van Kampen system size expansion was carried out. Here the bound on the state space provides a natural parameter for the rescaling and thus a small parameter for the expansion. The setup of this problem closely resembles the structure of excitable membranes, for which limits have been obtained with the present technique by one of the present authors and co-workers in [27]. Therefore we conjecture that our limit theorems also apply to this setting with minor adaptations and essentially the same conditions and results as in Section 2. However, the macroscopic limit which will be obtained does not conform with the Wilson-Cowan equation but will be given by

$$\tau\,\dot\nu(t,x) = -\nu(t,x) + \big(1-\nu(t,x)\big)\, f\Big(\int_D w(x,y)\,\nu(t,y)\,dy + I(t,x)\Big). \tag{3.5}$$
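
The effect of the additional factor $(1-\nu)$ can be seen in the simplest space-independent case. The sketch below integrates the scalar versions of the Wilson-Cowan equation and of (3.5) with the explicit Euler method; the parameter values, the sigmoid gain and the helper names are illustrative assumptions only.

```python
import math

def sigmoid(x):
    """Illustrative gain function (an assumption)."""
    return 1.0 / (1.0 + math.exp(-x))

def euler(rhs, u0, T, dt):
    """Explicit Euler integration of u' = rhs(u) on [0, T]; returns u(T)."""
    u = u0
    for _ in range(int(round(T / dt))):
        u += dt * rhs(u)
    return u

tau, w, I = 1.0, 2.0, 0.0   # hypothetical parameters, not taken from the paper

def wilson_cowan(u):
    """Scalar Wilson-Cowan right hand side."""
    return (-u + sigmoid(w * u + I)) / tau

def bounded_limit(u):
    """Scalar version of the right hand side of (3.5)."""
    return (-u + (1.0 - u) * sigmoid(w * u + I)) / tau

u_wc = euler(wilson_cowan, 0.0, T=20.0, dt=0.001)
u_bd = euler(bounded_limit, 0.0, T=20.0, dt=0.001)
```

For nonnegative coupling the solution of the bounded model stays below the Wilson-Cowan solution and inside $[0,1]$, in line with the saturation enforced by the bounded state space.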
Next, we return to the master equation (1.6) as discussed in this article in Section 1.2 and to the comment on bounded state spaces made there in a footnote. In our primary reference for this model, [5], a bounded state space version of the master equation was actually considered, where the activation rate for the event $\theta^k \to \theta^k + 1$ is

$$l(k)\,\bar f_k(\theta,t)\,\mathbb{I}_{[\theta^k < l(k)]}, \tag{3.6}$$

replacing $l(k)\bar f_k(\theta,t)$ in (1.6). The van Kampen system size expansion was then applied to this bounded state space master equation, tacitly neglecting possible difficulties which might arise due to the discontinuity of (3.6) considered as a function on $\mathbb{R}^P$. However, for the present, mathematically precise convergence results, considering a bounded state space as originally suggested in [5] is problematic. The discontinuous activation rate (3.6) causes the machinery developed in [27], which depends on Lipschitz-type estimates, to break down. Nevertheless, we strongly expect that also in this case the law of large numbers with the deterministic limit given by the Wilson-Cowan equation (1.3) holds. Furthermore, the Langevin approximations should also agree with the equations discussed in Section 2.3. However, we have not yet been able to prove such a theorem. We further conjecture that the results in this article can be used to prove the convergence for the bounded state space model by a domination argument. Heuristically, it seems clear that a bounded process should be dominated by a process that possesses the same dynamics inside the state space of the bounded process but can stray out of that bounded domain. Hence, as the limit of the potentially larger process lies within the domain where the two processes agree, the dominated process should also converge to the same limit. Mathematically, this line of argument relies on non-trivial estimates between occupation measures of high-dimensional Markov processes. This is work in progress.



3.3 Activity based neural field model 

Finally, we return to a difference in neural field theory mentioned in the beginning. In contrast to rate-based neural field models of the Wilson-Cowan type (1.1) there exists a second essential class of neural field models, so-called activity-based models, the prototype of which is the Amari equation

$$\tau\,\dot\nu(t,x) = -\nu(t,x) + \int_D w(x,y)\,f\big(\nu(t,y)\big)\,dy + I(t,x). \tag{3.7}$$

We conjecture that also for this type of equation a phenomenological microscopic model can be constructed with a suitable adaptation of the activation rates and that limit theorems analogous to the results in Section 2.1 hold. Then a Langevin equation for this model can also be obtained and used for further analysis.



4 Proofs of the main results 

In this section we present the proofs of the limit theorems. For the convenience of the reader, as it is an important tool in the subsequent proofs, we first state the Poincaré inequality. Let $D \subset \mathbb{R}^d$ be a convex domain. Then it holds for any function $\phi \in H^1(D)$ that

$$\|\phi - \phi_D\|_{L^2(D)} \le \frac{\operatorname{diam}(D)}{\pi}\,\|\nabla\phi\|_{L^2(D)}, \tag{4.1}$$

where $\phi_D$ is the mean value of the function $\phi$ on the domain $D$, i.e.,

$$\phi_D = \frac{1}{|D|}\int_D \phi(x)\,dx. \tag{4.2}$$

Moreover, the constant in the right hand side of (4.1) is the optimal constant depending only on the
diameter of the domain D, cf. [ni23|. Whenever we omit to denote the spatial domain for definition 
of norms or inner products in $L^2(D)$ or Sobolev spaces $H^\alpha(D)$ then it is to be interpreted as the norm over the whole domain $D$. If the norm is taken only over a subset $D_{k,n}$ then this is always indicated explicitly.
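
As a quick numerical sanity check of the Poincaré inequality with constant $\operatorname{diam}(D)/\pi$, the following sketch evaluates both sides on an interval (the only convex domains in one dimension) using the midpoint rule. The helper name `poincare_sides` and the two test functions are illustrative assumptions.

```python
import math

def poincare_sides(phi, dphi, a, b, n=20000):
    """Return (||phi - mean||_{L2(a,b)}, (b - a) / pi * ||phi'||_{L2(a,b)}) via the midpoint rule."""
    h = (b - a) / n
    xs = [a + (i + 0.5) * h for i in range(n)]
    mean = sum(phi(x) for x in xs) * h / (b - a)
    lhs = math.sqrt(sum((phi(x) - mean) ** 2 for x in xs) * h)
    rhs = (b - a) / math.pi * math.sqrt(sum(dphi(x) ** 2 for x in xs) * h)
    return lhs, rhs

# A generic smooth function: the inequality is strict.
lhs_sq, rhs_sq = poincare_sides(lambda x: x * x, lambda x: 2.0 * x, 0.0, 1.0)

# cos(pi * x / d) on (0, d) attains equality, which shows that the constant is sharp.
d = 0.5
lhs_cos, rhs_cos = poincare_sides(
    lambda x: math.cos(math.pi * x / d),
    lambda x: -math.pi / d * math.sin(math.pi * x / d),
    0.0, d,
)
```

The second example explains why no smaller constant than $\operatorname{diam}(D)/\pi$ can work for all convex domains.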

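For the benefit of the reader we next repeat the limiting equation

$$\tau\,\dot\nu(t,x) = -\nu(t,x) + f\Big(\int_D w(x,y)\,\nu(t,y)\,dy + I(t,x)\Big). \tag{4.3}$$

We denote by $F$ the Nemytzkii operator on $L^2(D)$ defined by

$$F(g,t)(x) = f\Big(\int_D w(x,y)\,g(y)\,dy + I(t,x)\Big) \qquad \forall\, g \in L^2(D), \tag{4.4}$$

and for all $\theta \in \mathbb{N}_0^P$ we define a discrete version $F^n$ of the Nemytzkii operator via

$$\begin{aligned}
-\tau^{-1}\nu^n(\theta) + \tau^{-1}F^n(\nu^n(\theta),t) &= \Lambda^n(\theta,t)\int_{\mathbb{N}_0^P} \big(\nu^n(\xi) - \nu^n(\theta)\big)\,\mu^n\big((\theta,t),d\xi\big) \\
&= \frac{1}{\tau}\sum_{k=1}^P \big(-\theta^k + l(k,n)\,\bar f_{k,n}(\theta,t)\big)\,\frac{1}{l(k,n)}\,\mathbb{I}_{D_{k,n}} \\
&= -\frac{1}{\tau}\,\nu^n(\theta) + \frac{1}{\tau}\sum_{k=1}^P \bar f_{k,n}(\theta,t)\,\mathbb{I}_{D_{k,n}}.
\end{aligned} \tag{4.5}$$

Note that $-\tau^{-1}(\phi, \nu^n(\theta))_{L^2} + \tau^{-1}(\phi, F^n(\nu^n(\theta),t))_{L^2}$ for $\phi \in L^2(D)$ corresponds to the generator of $(\Theta^n_t,t)_{t\ge0}$ applied to the function $(\theta,t) \mapsto (\phi, \nu^n(\theta))_{L^2}$.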

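Finally, another useful property is that the means of the process' components are bounded. For each $k$, $n$ it holds that

$$\mathbb{E}^n\Theta^{k,n}_t = \mathbb{E}^n\Theta^{k,n}_0 + \frac{1}{\tau}\int_0^t l(k,n)\,\mathbb{E}^n\bar f_{k,n}(Y^n_s) - \mathbb{E}^n\Theta^{k,n}_s\,ds \le \mathbb{E}^n\Theta^{k,n}_0 + \frac{1}{\tau}\int_0^t l(k,n)\,\|f\|_0 - \mathbb{E}^n\Theta^{k,n}_s\,ds,$$

see also (B.1). Therefore it holds that $\mathbb{E}^n\Theta^{k,n}_t \le m^{k,n}_t$, where $m^{k,n}_t$ solves the deterministic initial value problem

$$\dot m^{k,n}_t = -\frac{1}{\tau}\,m^{k,n}_t + \frac{1}{\tau}\,l(k,n)\,\|f\|_0, \qquad m^{k,n}_0 = \mathbb{E}^n\Theta^{k,n}_0,$$

i.e.,

$$m^{k,n}_t = e^{-t/\tau}\big(m^{k,n}_0 - l(k,n)\|f\|_0\big) + l(k,n)\|f\|_0 \le l(k,n)\big(1+\|f\|_0\big) \qquad \forall\, t\ge0. \tag{4.6}$$

Here we also used the assumption $\mathbb{E}^n\Theta^{k,n}_0 \le l(k,n)$ on the initial condition.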
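
The closed form of the comparison function and the resulting uniform bound can be checked numerically. The sketch below uses hypothetical parameter values, with `f_sup` playing the role of $\|f\|_0$; the function name `mean_bound` is an assumption made for illustration.

```python
import math

def mean_bound(t, m0, l, f_sup, tau=1.0):
    """Closed-form solution of m' = (-m + l * f_sup) / tau with m(0) = m0, cf. (4.6)."""
    c = l * f_sup
    return math.exp(-t / tau) * (m0 - c) + c

l, f_sup, tau = 50, 0.5, 2.0     # illustrative values only
times = (0.0, 0.5, 1.0, 5.0, 50.0)
values = [mean_bound(t, m0=l, l=l, f_sup=f_sup, tau=tau) for t in times]
```

With an initial value of at most $l(k,n)$ the solution never exceeds $l(k,n)(1+\|f\|_0)$, which is the bound used repeatedly in the proofs.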



4.1 Proof of Theorem 2.1 (Law of large numbers)

In order to prove the law of large numbers, Theorem 2.1, we apply the law of large numbers for Hilbert space valued PDMPs, see [27, Thm. 4.1], to the sequence of homogeneous PDMPs $(Y^n_t)_{t\ge0} = (\Theta^n_t,t)_{t\ge0}$. For the application of this theorem, recall that the first, piecewise constant, vector-valued component of this process counts the number of active neurons in each sub-population and the second, deterministic component states time. The process $(Y^n_t)_{t\ge0}$ is the usual 'space-time process', i.e., the homogeneous Markov process obtained from the inhomogeneous process $(\Theta^n_t)_{t\ge0}$ via a state-space extension. The continuous component satisfies the simple ODE $\dot t = 1$, $t(0) = 0$, and thus the full process is a PDMP. In the terminology of [27] the sequence of coordinate functions on the different state spaces of the PDMPs $(Y^n_t)_{t\ge0}$ into a common Hilbert space is given by the maps $\nu^n$ (2.7) with the common Hilbert space $L^2(D)$. Thus in order to infer convergence in probability (2.10) from [27, Thm. 4.1] it is sufficient to validate the following conditions:

(LLN1) For fixed $T > 0$ it holds that

$$\lim_{n\to\infty}\mathbb{E}^n\int_0^T \Lambda^n(Y^n_t)\int_{\mathbb{N}_0^P}\|\nu^n(\xi)-\nu^n(\Theta^n_t)\|^2_{L^2(D)}\,\mu^n(Y^n_t,d\xi)\,dt = 0. \tag{4.7}$$

(LLN2) The Nemytzkii operator $F$ satisfies a Lipschitz condition in $L^2(D)$ uniformly with respect to $t \ge 0$, i.e., there exists a constant $L_0 > 0$ such that

$$\|F(g_1,t)-F(g_2,t)\|_{L^2} \le L_0\,\|g_1-g_2\|_{L^2} \qquad \forall\, t \ge 0,\ g_1, g_2 \in L^2(D). \tag{4.8}$$

(LLN3) For fixed $T > 0$ it holds that

$$\lim_{n\to\infty}\mathbb{E}^n\int_0^T\|F^n(\nu^n_t,t)-F(\nu^n_t,t)\|_{L^2}\,dt = 0. \tag{4.9}$$

Note that the final condition of [27, Thm. 4.1], i.e., the convergence of the initial conditions, is satisfied by assumption. For a discussion of these conditions we refer to [27] and proceed to their derivation for the present model in the subsequent parts (a) to (c).

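(a) In order to prove condition (4.7) we write the integral with respect to the discrete probability measure $\mu^n$ as a sum. This yields

$$\begin{aligned}
\mathbb{E}^n\Lambda^n(Y^n_t)\int_{\mathbb{N}_0^P}\|\nu^n(\xi)-\nu^n(\Theta^n_t)\|^2_{L^2(D)}\,\mu^n(Y^n_t,d\xi) &= \frac{1}{\tau}\sum_{k=1}^P\mathbb{E}^n\,\frac{1}{l(k,n)^2}\Big(\Theta^{k,n}_t + l(k,n)\,\bar f_{k,n}(Y^n_t)\Big)|D_{k,n}| \\
&\le \frac{\big(1+2\|f\|_0\big)\,|D|}{\tau\,\ell_-(n)},
\end{aligned} \tag{4.10}$$

where we have used the upper bound (4.6) on the expectation $\mathbb{E}^n\Theta^{k,n}_t$ and the assumption on the initial conditions. Next, integrating over $[0,T]$ and employing the assumption $\lim_{n\to\infty}\ell_-(n) = \infty$ in (2.9) establishes condition (4.7).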



(b) The Lipschitz condition (4.8) of the Nemytzkii operators is a straightforward consequence of the Lipschitz continuity (1.4) of the gain function $f$ as

$$\begin{aligned}
\|F(g_1,t)-F(g_2,t)\|^2_{L^2} &= \int_D\Big|f\Big(\int_D w(x,y)g_1(y)\,dy + I(t,x)\Big) - f\Big(\int_D w(x,y)g_2(y)\,dy + I(t,x)\Big)\Big|^2dx \\
&\le L^2\int_D\Big|\int_D w(x,y)\big(g_1(y)-g_2(y)\big)\,dy\Big|^2dx \\
&\le L^2\int_D\|w(x,\cdot)\|^2_{L^2}\,\|g_1-g_2\|^2_{L^2}\,dx \\
&= L^2\,\|w\|^2_{L^2\times L^2}\,\|g_1-g_2\|^2_{L^2}.
\end{aligned}$$

Therefore (4.8) holds with Lipschitz constant $L_0 := L\,\|w\|_{L^2\times L^2}$.
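
The same chain of estimates holds verbatim for a grid discretisation of the operator, which allows a simple numerical check of the Lipschitz bound. In the sketch below the exponential kernel, the sigmoid gain (whose Lipschitz constant is $1/4$) and all names are illustrative assumptions.

```python
import math
import random

def sigmoid(x):
    """Illustrative gain function (an assumption)."""
    return 1.0 / (1.0 + math.exp(-x))

L_F = 0.25  # Lipschitz constant of the sigmoid: its derivative is at most 1/4

def nemytskii(g, w, I, h):
    """Discretised operator F(g)(x_i) = f(sum_j w[i][j] * g[j] * h + I[i]) on a uniform grid."""
    return [sigmoid(sum(wij * gj for wij, gj in zip(row, g)) * h + Ii)
            for row, Ii in zip(w, I)]

def l2_norm(v, h):
    """Discrete L2 norm with mesh width h."""
    return math.sqrt(sum(x * x for x in v) * h)

n = 40
h = 1.0 / n
xs = [(i + 0.5) * h for i in range(n)]
w = [[math.exp(-abs(x - y)) for y in xs] for x in xs]   # hypothetical kernel on D = (0, 1)
I = [0.0] * n
rng = random.Random(1)
g1 = [rng.uniform(-1.0, 1.0) for _ in range(n)]
g2 = [rng.uniform(-1.0, 1.0) for _ in range(n)]

diff_F = [a - b for a, b in zip(nemytskii(g1, w, I, h), nemytskii(g2, w, I, h))]
lhs = l2_norm(diff_F, h)
w_norm = math.sqrt(sum(wij * wij for row in w for wij in row) * h * h)  # discrete L2 x L2 norm
rhs = L_F * w_norm * l2_norm([a - b for a, b in zip(g1, g2)], h)
```

Here `lhs` is the discrete analogue of $\|F(g_1,t)-F(g_2,t)\|_{L^2}$ and `rhs` the analogue of $L\,\|w\|_{L^2\times L^2}\|g_1-g_2\|_{L^2}$.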

(c) Finally we prove the convergence of the generators (4.9). To this end we employ the characterisation of the norm in $L^2(D)$ by $\|\eta\|_{L^2} = \sup_{\|\phi\|_{L^2}=1}|(\phi,\eta)_{L^2}|$ for all $\eta \in L^2(D)$ and thus consider first the scalar product of elements $\phi \in L^2(D)$ with $\|\phi\|_{L^2} = 1$ and the difference inside the norm in (4.9). On the one hand we obtain using definition (4.5) that

$$\big(\phi, F^n(\nu^n_t,t)\big)_{L^2} = \Big(\phi, \sum_{k=1}^P \bar f_{k,n}(Y^n_t)\,\mathbb{I}_{D_{k,n}}\Big)_{L^2}. \tag{4.11}$$

Next we apply the Nemytzkii operator $F$ defined in (4.4) to $\nu^n_t$ and take the inner product of the result with $\phi$ to obtain on the other hand

$$\big(\phi, F(\nu^n_t,t)\big)_{L^2} = \Big(\phi, f\Big(\sum_{k=1}^P\frac{\Theta^{k,n}_t}{l(k,n)}\int_{D_{k,n}} w(\cdot,y)\,dy + I(t,\cdot)\Big)\Big)_{L^2}. \tag{4.12}$$

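Subtracting (4.12) from (4.11) we obtain the integrated difference

$$\big(\phi, F^n(\nu^n_t,t) - F(\nu^n_t,t)\big)_{L^2} = \sum_{k=1}^P\int_{D_{k,n}}\phi(x)\Big[f\Big(\sum_{j=1}^P \bar W^n_{kj}\,\frac{\Theta^{j,n}_t}{l(j,n)} + \bar I_{k,n}(t)\Big) - f\Big(\sum_{j=1}^P\frac{\Theta^{j,n}_t}{l(j,n)}\int_{D_{j,n}} w(x,y)\,dy + I(t,x)\Big)\Big]\,dx.$$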

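We proceed to estimate the norm of the term in the right hand side. We use the Lipschitz condition (1.4) on $f$, the triangle inequality and finally the Cauchy-Schwarz inequality on the resulting second term to obtain the estimate

$$\begin{aligned}
\big|\big(\phi, F^n(\nu^n_t,t) - F(\nu^n_t,t)\big)_{L^2}\big| \le{}& \underbrace{L\sum_{k=1}^P\int_{D_{k,n}}|\phi(x)|\,\Big|\sum_{j=1}^P\frac{\Theta^{j,n}_t}{l(j,n)}\Big(\bar W^n_{kj} - \int_{D_{j,n}} w(x,y)\,dy\Big)\Big|\,dx}_{(*)} \\
&+ \underbrace{L\sum_{k=1}^P\|\phi\|_{L^2(D_{k,n})}\,\|\bar I_{k,n}(t) - I(t)\|_{L^2(D_{k,n})}}_{(**)}.
\end{aligned}$$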

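Here, the term in the right hand side marked (**) is further estimated using the Cauchy-Schwarz inequality and the Poincaré inequality (4.1), which yields

$$(**) \le L\Big(\sum_{k=1}^P\|\bar I_{k,n}(t) - I(t)\|^2_{L^2(D_{k,n})}\Big)^{1/2} \le L\,\frac{\delta_+(n)}{\pi}\,\|\nabla_x I(t)\|_{L^2}. \tag{4.13}$$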

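We now consider the term marked (*). Inserting the definition of $\bar W^n_{kj}$ given in (2.4), reordering the summations and changing the order of integration yields

$$\begin{aligned}
(*) &= L\sum_{k=1}^P\int_{D_{k,n}}|\phi(x)|\,\Big|\sum_{j=1}^P\frac{\Theta^{j,n}_t}{l(j,n)}\int_{D_{j,n}}\Big(\frac{1}{|D_{k,n}|}\int_{D_{k,n}} w(z,y)\,dz - w(x,y)\Big)dy\Big|\,dx \\
&\le L\sum_{j=1}^P\frac{\Theta^{j,n}_t}{l(j,n)}\int_{D_{j,n}}\sum_{k=1}^P\Big[\int_{D_{k,n}}|\phi(x)|\,\Big|\frac{1}{|D_{k,n}|}\int_{D_{k,n}} w(z,y)\,dz - w(x,y)\Big|\,dx\Big]\,dy.
\end{aligned}$$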

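We next apply the Cauchy-Schwarz inequality to the integral inside the square brackets in the last term. Thus we obtain the estimate

$$(*) \le L\sum_{j=1}^P\frac{\Theta^{j,n}_t}{l(j,n)}\int_{D_{j,n}}\sum_{k=1}^P\|\phi\|_{L^2(D_{k,n})}\Big[\int_{D_{k,n}}\Big|\frac{1}{|D_{k,n}|}\int_{D_{k,n}} w(z,y)\,dz - w(x,y)\Big|^2dx\Big]^{1/2}dy.$$

Now the Poincaré inequality (4.1) is applied to the innermost integral inside the square brackets, which yields

$$(*) \le L\,\frac{\delta_+(n)}{\pi}\sum_{j=1}^P\frac{\Theta^{j,n}_t}{l(j,n)}\int_{D_{j,n}}\sum_{k=1}^P\|\phi\|_{L^2(D_{k,n})}\,\|\nabla_x w(\cdot,y)\|_{L^2(D_{k,n})}\,dy.$$

Finally, using once more the Cauchy-Schwarz inequality on the innermost summation we obtain

$$(*) \le L\,\frac{\delta_+(n)}{\pi}\sum_{j=1}^P\frac{\Theta^{j,n}_t}{l(j,n)}\int_{D_{j,n}}\|\nabla_x w(\cdot,y)\|_{L^2}\,dy. \tag{4.14}$$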
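Now, a combination of the estimates (4.13) and (4.14) on the terms (*) and (**) yields

$$\big|\big(\phi,F^n(\nu^n_t,t)\big)_{L^2} - \big(\phi,F(\nu^n_t,t)\big)_{L^2}\big| \le \delta_+(n)\,\frac{L}{\pi}\Big(\sum_{j=1}^P\frac{\Theta^{j,n}_t}{l(j,n)}\int_{D_{j,n}}\|\nabla_x w(\cdot,y)\|_{L^2}\,dy + \|\nabla_x I(t)\|_{L^2}\Big).$$

Here the right hand side is independent of $\phi$, hence taking the supremum over all $\phi$ with $\|\phi\|_{L^2} = 1$ yields

$$\|F^n(\nu^n_t,t) - F(\nu^n_t,t)\|_{L^2} \le \delta_+(n)\,\frac{L}{\pi}\Big(\sum_{j=1}^P\frac{\Theta^{j,n}_t}{l(j,n)}\int_{D_{j,n}}\|\nabla_x w(\cdot,y)\|_{L^2}\,dy + \|\nabla_x I(t)\|_{L^2}\Big).$$

Finally, integrating over $(0,T)$ and taking the expectation on both sides results in

$$\mathbb{E}^n\int_0^T\|F^n(\nu^n_t,t) - F(\nu^n_t,t)\|_{L^2}\,dt \le \delta_+(n)\,\frac{L}{\pi}\Big(\sqrt{|D|}\,T\,\big(1+\|f\|_0\big)\,\|\nabla_x w\|_{L^2\times L^2} + \|\nabla_x I\|_{L^1((0,T),L^2)}\Big). \tag{4.15}$$

Here we have used (4.6) and a combination of the Cauchy-Schwarz and Poincaré inequality (4.1) in order to estimate

$$L\,\frac{\delta_+(n)}{\pi}\,\mathbb{E}^n\sum_{j=1}^P\frac{\Theta^{j,n}_t}{l(j,n)}\int_{D_{j,n}}\|\nabla_x w(\cdot,y)\|_{L^2}\,dy \le \delta_+(n)\,\frac{L\sqrt{|D|}}{\pi}\,\big(1+\|f\|_0\big)\,\|\nabla_x w\|_{L^2\times L^2}.$$

The upper bound in (4.15) is of order $O(\delta_+(n))$ and therefore converges to zero for $n \to \infty$ due to assumption (2.9). Hence, condition (4.9) is satisfied. The proof of the convergence in probability (2.10) is completed.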
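
The law of large numbers can be illustrated in the simplest possible setting of a single population ($P = 1$, no spatial structure), where only the number of neurons $l$ grows. The following sketch, which is an illustration under stated assumptions and not the construction used in the proof, simulates the jump process with the Gillespie algorithm and records the largest deviation of $\Theta_t/l$ from an Euler approximation of the scalar Wilson-Cowan equation. The sigmoid gain, the parameter values and the function names are hypothetical.

```python
import math
import random

def f(x):
    """Illustrative sigmoid gain (an assumption)."""
    return 1.0 / (1.0 + math.exp(-x))

def wilson_cowan_path(T, dt, w=1.0, I=0.0, tau=1.0, nu0=0.0):
    """Explicit Euler path of tau * nu' = -nu + f(w * nu + I) on the grid 0, dt, ..., T."""
    nu = [nu0]
    for _ in range(int(round(T / dt))):
        v = nu[-1]
        nu.append(v + dt * (-v + f(w * v + I)) / tau)
    return nu

def gillespie_sup_error(l, T=5.0, dt=0.01, w=1.0, I=0.0, tau=1.0, seed=0):
    """Largest grid deviation |Theta_t / l - nu(t)| for one simulated path started at Theta_0 = 0."""
    rng = random.Random(seed)
    ode = wilson_cowan_path(T, dt, w, I, tau)
    theta, t, err, i = 0, 0.0, 0.0, 0
    while i < len(ode):
        up = l * f(w * theta / l + I) / tau     # activation rate
        down = theta / tau                      # decay rate
        t_next = t + rng.expovariate(up + down)
        # The state theta is constant on [t, t_next): compare on all grid points in between.
        while i < len(ode) and i * dt < t_next:
            err = max(err, abs(theta / l - ode[i]))
            i += 1
        theta += 1 if rng.random() * (up + down) < up else -1
        t = t_next
    return err

errors = {l: gillespie_sup_error(l) for l in (20, 200, 2000)}
```

For large `l` the recorded deviation is typically small, in line with the fluctuations of order $l^{-1/2}$ suggested by the bound (4.10).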

It is now easy to extend this result to the convergence in the $r$-th mean. First of all the convergence in probability (2.10) implies for all $r \ge 1$ the convergence in probability of the random variables $\sup_{t\in[0,T]}\|\nu^n_t - \nu(t)\|^r_{L^2}$ to zero. As convergence in the mean of real valued random variables is equivalent to convergence in probability and uniform integrability, it remains to prove the latter for the families $\sup_{t\in[0,T]}\|\nu^n_t - \nu(t)\|^r_{L^2}$, $n \in \mathbb{N}$.

We first consider the case $r = 1$ and establish a uniform bound on the second moments $\mathbb{E}^n\sup_{t\in[0,T]}\|\nu^n_t - \nu(t)\|^2_{L^2}$. Then the de la Vallée-Poussin Theorem, cf. [15, App., Prop. 2.2], implies that the random variables $\sup_{t\in[0,T]}\|\nu^n_t - \nu(t)\|_{L^2}$, $n \in \mathbb{N}$, are uniformly integrable.

Without loss of generality we can assume that there exist$^1$ Poisson processes $(N^{k,n}_t)_{t\ge0}$ with rates $\Lambda_{k,n} = l(k,n)\big(1+\|f\|_0\big)/\tau$, which dominate $(\Theta^{k,n}_t - \Theta^{k,n}_0)_{t\ge0}$ pathwise.

$^1$ The Poisson process jumps at a faster rate than the components of the Markov chain regardless of the time and the state these are in. Furthermore all jumps are upwards. Hence using a coupling argument as discussed in the proof of [16, Thm. 4.3.5] we find that there exists a probability space supporting two processes with distributions equivalent to the Poisson process and the Markov chain component such that the Poisson process dominates the second process for all paths. Clearly, all moments dominate and these inequalities are then valid for any probability space supporting these processes.

Then we obtain almost surely

Here the right hand side is independent of $t \le T$ and thus we obtain

where we have used that $N^{k,n}_T$ is Poisson distributed with rate $T\Lambda_{k,n}$ and thus $\mathbb{E}^n(N^{k,n}_T)^2 = T\Lambda_{k,n} + T^2\Lambda_{k,n}^2$. Here $C_T$ is some finite constant which depends on $T$ and the overall parameters of the model, i.e., $\tau$, $f$, $D$, but is independent of $k$ and $n$. Using this upper bound the triangle inequality yields the estimate

$$\mathbb{E}^n\sup_{t\in[0,T]}\|\nu^n_t - \nu(t)\|^2_{L^2} \le 2C_T\Big(\mathbb{E}^n\|\nu^n_0\|^2_{L^2} + \|\nu\|^2_{C([0,T],L^2)} + 1\Big).$$

Therefore using the assumption $\sup_{n\in\mathbb{N}}\mathbb{E}^n\|\nu^n_0\|^2_{L^2} < \infty$ it holds that

$$\sup_{n\in\mathbb{N}}\mathbb{E}^n\sup_{t\in[0,T]}\|\nu^n_t - \nu(t)\|^2_{L^2} < \infty.$$
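
The second moment identity for the Poisson distribution used in this argument, $\mathbb{E}N^2 = \lambda + \lambda^2$ for $N$ with rate $\lambda$, can be confirmed by summing the probability mass function directly. The function name and the truncation level `kmax` are illustrative choices.

```python
import math

def poisson_second_moment(lam, kmax=400):
    """E[N^2] for N ~ Poisson(lam), summed from the probability mass function up to kmax."""
    p = math.exp(-lam)       # P(N = 0)
    total = 0.0
    for k in range(1, kmax + 1):
        p *= lam / k         # P(N = k) computed from P(N = k - 1)
        total += k * k * p
    return total
```

In the setting above, `lam` corresponds to $T\Lambda_{k,n}$, so that the second moment grows quadratically in $l(k,n)$ and stays bounded after division by $l(k,n)^2$.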

The general case for $r > 1$ works analogously. Note that the $r$-th moment of the Poisson distribution is a polynomial of degree $r$ in its rate. Hence, just as in the case of $r = 1$, the term

can thus be bounded from above by some constant $C_T$ independent of $k$ and $n$. The proof of Theorem 2.1 is completed.



4.2 Proof of Corollary 2.1 (Corollary to the law of large numbers)

For $\alpha = 0$ the statement of the corollary coincides with the statement of Theorem 2.1, hence we consider $\alpha > 0$. As in the proof of Theorem 2.1 we apply [27, Thm. 4.1] to the PDMPs $(Y^n_t)_{t\ge0}$, however this time for the functions $\nu^n$ understood as taking values in the Hilbert space $H^{-\alpha}$ instead of $L^2$. Thus we have to validate again conditions (LLN1)-(LLN3) wherein the norm in $L^2$ is always replaced by the norm in $H^{-\alpha}$. The essential argument is sharpening the estimates in part (a) of the proof of Theorem 2.1 using optimal Sobolev Embedding Theorems such that the conditions (2.13) imply (LLN1). This we present in part (a) of the proof below. The Lipschitz condition (LLN2) of the Nemytzkii operator $F$ in the spaces $H^{-\alpha}$ is established in part (b). Finally, as the condition $\delta_+(n) \to 0$ remains as in Theorem 2.1, the condition (LLN3) follows immediately from the proof of Theorem 2.1 due to the continuous embedding of $L^2$ into $H^{-\alpha}$.

(a) In the case $\alpha = 0$, i.e., $H^{-\alpha} = L^2$, we used in (4.10) that $\|\mathbb{I}_{D_{k,n}}\|^2_{L^2} = |D_{k,n}|$. For general $\alpha > 0$ we use the representation

$$\|\mathbb{I}_{D_{k,n}}\|_{H^{-\alpha}} = \sup_{\|\phi\|_{H^\alpha}=1}\big|(\phi,\mathbb{I}_{D_{k,n}})_{L^2}\big|.$$

In order to estimate the terms inside the supremum in the right hand side we use Hölder's inequality and the Sobolev embedding theorem, i.e., $H^\alpha(D) \hookrightarrow L^\infty(D)$ for $\alpha > d/2$ and $H^\alpha(D) \hookrightarrow L^r(D)$ with $r = d/(d/2-\alpha)$ for $0 < \alpha < d/2$, see [2, Thm. 7.34, Corol. 7.17]. Thus we obtain

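$$\|\mathbb{I}_{D_{k,n}}\|_{H^{-\alpha}} \le \begin{cases} K_{d/(d/2-\alpha)}\,\|\mathbb{I}_{D_{k,n}}\|_{L^{2d/(d+2\alpha)}} & \text{if } 0 < \alpha < d/2,\\[2pt] K_\infty\,\|\mathbb{I}_{D_{k,n}}\|_{L^1} & \text{if } d/2 < \alpha, \end{cases}$$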




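where the constants $K$ are the constants arising from the continuous embeddings of the Sobolev spaces into the Lebesgue spaces. Evaluating the norms in the right hand side and further estimating using the maximal Lebesgue measure of the elements of the partition yields

$$\|\mathbb{I}_{D_{k,n}}\|^2_{H^{-\alpha}} \le \begin{cases} K^2_{d/(d/2-\alpha)}\,|D_{k,n}|\,v_+(n)^{2\alpha/d} & \text{if } 0 < \alpha < d/2,\\[2pt] K^2_\infty\,|D_{k,n}|\,v_+(n) & \text{if } d/2 < \alpha. \end{cases}$$

Note that the upper bounds are consistent with the condition in Theorem 2.1 for $\alpha = 0$. Finally, as $H^{d/2} \hookrightarrow H^{d/2-\epsilon}$ for all small $\epsilon > 0$, the result for $\alpha = d/2$ follows from the result above as

$$\|\mathbb{I}_{D_{k,n}}\|_{H^{-d/2}} \le \sup_{\|\phi\|_{H^{d/2}}=1}\Big(\|\phi\|_{H^{d/2-\epsilon}}\,\|\mathbb{I}_{D_{k,n}}\|_{H^{-(d/2-\epsilon)}}\Big) \le C\,\|\mathbb{I}_{D_{k,n}}\|_{H^{-(d/2-\epsilon)}},$$

where $C$ is the constant resulting from the continuous embedding of $H^{d/2}(D)$ into $H^{d/2-\epsilon}(D)$. Thus we obtain for all $\epsilon > 0$ the estimate

$$\|\mathbb{I}_{D_{k,n}}\|^2_{H^{-d/2}} \le C_\epsilon\,|D_{k,n}|\,v_+(n)^{1-\epsilon}.$$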
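
The case distinction for the exponent of $v_+(n)$ is easy to get wrong, so a small helper may be useful. The sketch below encodes the three cases and the conjugate Lebesgue exponent used in Hölder's inequality; the function names and the convention of passing `eps` explicitly are assumptions made for illustration.

```python
def rate_exponent(alpha, d, eps=0.0):
    """Exponent r in ||I_D||^2_{H^-alpha} <= K * |D| * v_+^r for the three cases above."""
    if alpha < 0:
        raise ValueError("alpha must be nonnegative")
    if 2 * alpha < d:
        return 2.0 * alpha / d
    if 2 * alpha == d:
        return 1.0 - eps      # borderline case: any small eps > 0 is admissible
    return 1.0

def dual_lebesgue_exponent(alpha, d):
    """Conjugate exponent of r = d / (d / 2 - alpha), valid for 0 < alpha < d / 2."""
    return 2.0 * d / (d + 2.0 * alpha)
```

For example, in dimension $d = 2$ and with $\alpha = 1/2$ the Sobolev exponent is $4$, its conjugate is $4/3$, and the indicator norm squared scales like $|D_{k,n}|^{3/2}$, i.e., $r = 1/2$.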

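(b) Next we have to establish that the Nemytzkii operator $F$ on $L^2(D)$ is also Lipschitz continuous with respect to the norms $\|\cdot\|_{H^{-\alpha}}$, $\alpha > 0$, i.e., for all $\alpha > 0$ there exists a constant $L_{-\alpha}$ such that

$$\|F(g_1,t)-F(g_2,t)\|_{H^{-\alpha}} \le L_{-\alpha}\,\|g_1-g_2\|_{H^{-\alpha}} \qquad \forall\, t \ge 0,\ g_1, g_2 \in L^2(D). \tag{4.16}$$

We obtain due to the Lipschitz continuity of $f$, which implies absolute continuity of $f$, that

$$\Big|\int_D\phi(x)\big(F(g_1,t)(x) - F(g_2,t)(x)\big)\,dx\Big| = \Big|\int_D\phi(x)\int_{z_1(t,x)}^{z_2(t,x)} f'(z)\,dz\,dx\Big|,$$

where

$$z_1(t,x) = \int_D w(x,y)\,g_1(y)\,dy + I(t,x), \qquad z_2(t,x) = \int_D w(x,y)\,g_2(y)\,dy + I(t,x).$$

Applying Hölder's inequality with conjugate exponents $p$, $q$ and the essential boundedness of the derivative $f'$ we obtain the estimate

$$\begin{aligned}
\Big|\int_D\phi(x)\big(F(g_1,t)(x) - F(g_2,t)(x)\big)\,dx\Big| &\le \|\phi\|_{L^p}\Big(\int_D\Big|\int_{z_1(t,x)}^{z_2(t,x)} f'(z)\,dz\Big|^q dx\Big)^{1/q} \\
&\le \|\phi\|_{L^p}\Big(\int_D\Big|\,\|f'\|_{L^\infty}\big(z_1(t,x) - z_2(t,x)\big)\Big|^q dx\Big)^{1/q} \\
&= \|\phi\|_{L^p}\,\|f'\|_{L^\infty}\Big(\int_D\Big|\int_D w(x,y)\big(g_1(y) - g_2(y)\big)\,dy\Big|^q dx\Big)^{1/q}.
\end{aligned}$$

Next, as by assumption $w(x,\cdot) \in H^\alpha$ we obtain

$$\begin{aligned}
\Big(\int_D\Big|\int_D w(x,y)\big(g_1(y) - g_2(y)\big)\,dy\Big|^q dx\Big)^{1/q} &= \Big(\int_D\big|\big(w(x,\cdot), g_1 - g_2\big)_{L^2}\big|^q dx\Big)^{1/q} \\
&\le \Big(\int_D\|w(x,\cdot)\|^q_{H^\alpha}\,\|g_1 - g_2\|^q_{H^{-\alpha}}\,dx\Big)^{1/q} \\
&= \|w\|_{L^q\times H^\alpha}\,\|g_1 - g_2\|_{H^{-\alpha}}.
\end{aligned}$$

Overall this yields the estimate

$$\big|\big(\phi, F(g_1,t) - F(g_2,t)\big)_{L^2}\big| \le \|\phi\|_{L^p}\,\|f'\|_{L^\infty}\,\|w\|_{L^q\times H^\alpha}\,\|g_1 - g_2\|_{H^{-\alpha}}.$$

Hence taking the supremum on both sides of this inequality over all $\phi$ with $\|\phi\|_{H^\alpha} = 1$ we obtain the Lipschitz condition (4.16) with $L_{-\alpha} := L\,K_\alpha\,\|w\|_{L^q\times H^\alpha}$, where $K_\alpha$ is the constant resulting from the continuous embedding of $H^\alpha$ into $L^p$ and the Lipschitz constant $L$ of $f$ satisfies $L \ge \|f'\|_{L^\infty}$.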
4.3 Proof of Theorem 2.2 (Infinite time convergence)

(a) We first present an alternative representation for the jump processes $(\Theta^n_t)_{t\ge0}$ and the solution $\nu$ of the Wilson-Cowan equation (1.3). Using the generator of the PDMP $(\Theta^n_t,t)_{t\ge0}$ we obtain that the components $\Theta^{k,n}$ satisfy

$$\begin{aligned}
\Theta^{k,n}_t &= \Theta^{k,n}_0 + \int_0^t\Lambda^n(\Theta^n_s,s)\int_{\mathbb{N}_0^P}\big(\xi^k - \Theta^{k,n}_s\big)\,\mu^n\big((\Theta^n_s,s),d\xi\big)\,ds + M^{k,n}_t \\
&= \Theta^{k,n}_0 + \int_0^t\Big(-\frac{1}{\tau}\,\Theta^{k,n}_s + \frac{1}{\tau}\,l(k,n)\,\bar f_{k,n}(\Theta^n_s,s)\Big)\,ds + M^{k,n}_t,
\end{aligned} \tag{4.17}$$

where $(M^{k,n}_t)_{t\ge0}$ is a square-integrable càdlàg martingale given by

$$M^{k,n}_t := \Theta^{k,n}_t - \Theta^{k,n}_0 - \int_0^t\Lambda^n(\Theta^n_s,s)\int_{\mathbb{N}_0^P}\big(\xi^k - \Theta^{k,n}_s\big)\,\mu^n\big((\Theta^n_s,s),d\xi\big)\,ds. \tag{4.18}$$

As the jump process is regular this martingale is almost surely of finite variation and it could also be written in terms of a stochastic integral with respect to the associated martingale measure of the PDMP [16]. Next, considering $\Theta^{k,n}$ as the solution of the stochastic differential equation (4.17) driven by the martingale $M^{k,n}$ (a solution clearly exists, as the stochastic integral equation (4.17) is constructed from a solution), it follows from the variation of constants formula that it satisfies

$$\Theta^{k,n}_t = e^{-t/\tau}\,\Theta^{k,n}_0 + \frac{l(k,n)}{\tau}\int_0^t e^{-(t-s)/\tau}\,\bar f_{k,n}(\Theta^n_s,s)\,ds + \int_0^t e^{-(t-s)/\tau}\,dM^{k,n}_s. \tag{4.19}$$

This formula can also be easily verified path-by-path by inserting (4.19) into (4.17) and using integration by parts. Note that here the stochastic integral with respect to the martingale is just a Riemann-Stieltjes integral as the martingale is of finite variation. For the sake of completeness we
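briefly sketch the arguments. Thus, inserting (4.19) into the right hand side of (4.17) yields

$$\begin{aligned}
&\underbrace{\Theta^{k,n}_0 - \frac{1}{\tau}\int_0^t e^{-s/\tau}\,\Theta^{k,n}_0\,ds}_{(*)}
\;\underbrace{-\,\frac{l(k,n)}{\tau}\Big(\frac{1}{\tau}\int_0^t\!\!\int_0^s e^{-(s-r)/\tau}\,\bar f_{k,n}(\Theta^n_r,r)\,dr\,ds - \int_0^t\bar f_{k,n}(\Theta^n_s,s)\,ds\Big)}_{(**)} \\
&\qquad\underbrace{-\,\frac{1}{\tau}\int_0^t\!\!\int_0^s e^{-(s-r)/\tau}\,dM^{k,n}_r\,ds + M^{k,n}_t}_{(***)}.
\end{aligned}$$

Considering the three terms marked (*)-(***) separately, we show that this right hand side equals (4.19). For the first term (*) simply evaluating the integral yields

$$\Theta^{k,n}_0 - \frac{1}{\tau}\int_0^t e^{-s/\tau}\,\Theta^{k,n}_0\,ds = \Theta^{k,n}_0 - \frac{1}{\tau}\big(-\tau e^{-t/\tau} + \tau\big)\Theta^{k,n}_0 = e^{-t/\tau}\,\Theta^{k,n}_0,$$

which gives the first term in the right hand side of (4.19). Next we simplify the term (**) employing integration by parts to the first term in (**), which yields

$$\frac{1}{\tau}\int_0^t\!\!\int_0^s e^{-(s-r)/\tau}\,\bar f_{k,n}(\Theta^n_r,r)\,dr\,ds = -\int_0^t e^{-(t-s)/\tau}\,\bar f_{k,n}(\Theta^n_s,s)\,ds + \int_0^t\bar f_{k,n}(\Theta^n_s,s)\,ds.$$

Thus we obtain subtracting from this right hand side the second term in (**) that

$$(**) = \frac{l(k,n)}{\tau}\int_0^t e^{-(t-s)/\tau}\,\bar f_{k,n}(\Theta^n_s,s)\,ds.$$

This term is just the second term in the right hand side of (4.19). It remains to consider the term marked (***). We have already stated that the stochastic integral with respect to the martingale (4.18) is defined path-by-path as a Riemann-Stieltjes integral and thus satisfies

$$\begin{aligned}
-\frac{1}{\tau}\int_0^s e^{-(s-r)/\tau}\,dM^{k,n}_r ={}& -\frac{1}{\tau}\sum_{\tau^n_j\le s} e^{-(s-\tau^n_j)/\tau}\big(\Theta^{k,n}_{\tau^n_j} - \Theta^{k,n}_{\tau^n_j-}\big) \\
&+ \frac{1}{\tau}\int_0^s e^{-(s-r)/\tau}\,\Lambda^n(\Theta^n_r,r)\int_{\mathbb{N}_0^P}\big(\xi^k - \Theta^{k,n}_r\big)\,\mu^n\big((\Theta^n_r,r),d\xi\big)\,dr,
\end{aligned} \tag{4.20}$$

where $\tau^n_j$ denotes the $j$-th jump time of the $n$-th PDMP. Integrating the sum in this right hand side over $(0,t)$ yields

$$-\frac{1}{\tau}\int_0^t\sum_{\tau^n_j\le s} e^{-(s-\tau^n_j)/\tau}\big(\Theta^{k,n}_{\tau^n_j} - \Theta^{k,n}_{\tau^n_j-}\big)\,ds = \sum_{\tau^n_j\le t} e^{-(t-\tau^n_j)/\tau}\big(\Theta^{k,n}_{\tau^n_j} - \Theta^{k,n}_{\tau^n_j-}\big) - \big(\Theta^{k,n}_t - \Theta^{k,n}_0\big).$$

Next, we apply integration by parts to the integral over $(0,t)$ of the second term above analogously to the application to term (**) and obtain

$$\begin{aligned}
&\frac{1}{\tau}\int_0^t\!\!\int_0^s e^{-(s-r)/\tau}\,\Lambda^n(\Theta^n_r,r)\int_{\mathbb{N}_0^P}\big(\xi^k - \Theta^{k,n}_r\big)\,\mu^n\big((\Theta^n_r,r),d\xi\big)\,dr\,ds \\
&\qquad= -\int_0^t e^{-(t-s)/\tau}\,\Lambda^n(\Theta^n_s,s)\int_{\mathbb{N}_0^P}\big(\xi^k - \Theta^{k,n}_s\big)\,\mu^n\big((\Theta^n_s,s),d\xi\big)\,ds + \int_0^t\Lambda^n(\Theta^n_s,s)\int_{\mathbb{N}_0^P}\big(\xi^k - \Theta^{k,n}_s\big)\,\mu^n\big((\Theta^n_s,s),d\xi\big)\,ds.
\end{aligned}$$

Adding $M^{k,n}_t$ as given by (4.18), the terms without exponential weight cancel. Hence, overall these considerations show that

$$(***) = \int_0^t e^{-(t-s)/\tau}\,dM^{k,n}_s$$

and we obtain the final, third term in the right hand side of (4.19). This completes the proof that (4.19) solves the equation (4.17).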

Further, we obtain from the variation of constants formula for $\Theta^{k,n}$ also a representation for the stochastic mean activity $\nu^n_t$ by inserting (4.19) into its definition (2.7). This gives

$$\nu^n_t = e^{-t/\tau}\,\nu^n_0 + \frac{1}{\tau}\int_0^t e^{-(t-s)/\tau}\,F^n(\nu^n_s,s)\,ds + \sum_{k=1}^P\frac{1}{l(k,n)}\int_0^t e^{-(t-s)/\tau}\,dM^{k,n}_s\;\mathbb{I}_{D_{k,n}}. \tag{4.21}$$

Finally, in order to compare stochastic and deterministic solutions we use that the solution of the Wilson-Cowan equation can also be given via the variation of constants formula, i.e., it holds that for all $t \ge 0$

$$\nu(t) = e^{-t/\tau}\,\nu(0) + \frac{1}{\tau}\int_0^t e^{-(t-s)/\tau}\,F(\nu(s),s)\,ds. \tag{4.22}$$
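
The deterministic variation of constants formula can be checked numerically in the scalar, space-independent case. The sketch below computes an explicit Euler path of the equation and then evaluates the right hand side of the formula along that path with the trapezoidal rule; the parameter values, the sigmoid gain and the variable names are illustrative assumptions.

```python
import math

def sigmoid(x):
    """Illustrative gain function (an assumption)."""
    return 1.0 / (1.0 + math.exp(-x))

tau, w, I, nu0 = 1.5, 1.0, 0.2, 0.0   # hypothetical scalar setting
dt, steps = 0.001, 4000

def F(nu):
    """Scalar analogue of the Nemytzkii operator."""
    return sigmoid(w * nu + I)

# Explicit Euler path of tau * nu' = -nu + F(nu).
path = [nu0]
for _ in range(steps):
    path.append(path[-1] + dt * (-path[-1] + F(path[-1])) / tau)

# Right hand side of the variation of constants formula at t = steps * dt (trapezoidal rule).
t = steps * dt
vals = [math.exp(-(t - i * dt) / tau) * F(path[i]) for i in range(steps + 1)]
integral = dt * (sum(vals) - 0.5 * (vals[0] + vals[-1]))
mild = math.exp(-t / tau) * nu0 + integral / tau
```

Up to discretisation error, `mild` agrees with the last value of the Euler path, which is the scalar counterpart of the representation used to compare the stochastic and deterministic solutions.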



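Thus, subtracting (4.22) from (4.21) and taking the expectation of the norm in $H^{-\alpha}$ yields the estimate

$$\begin{aligned}
\mathbb{E}^n\|\nu(t) - \nu^n_t\|_{H^{-\alpha}} \le{}& e^{-t/\tau}\,\mathbb{E}^n\|\nu(0) - \nu^n_0\|_{H^{-\alpha}} + \frac{1}{\tau}\int_0^t e^{-(t-s)/\tau}\,\mathbb{E}^n\|F(\nu(s),s) - F^n(\nu^n_s,s)\|_{H^{-\alpha}}\,ds \\
&+ \mathbb{E}^n\Big\|\sum_{k=1}^P\frac{1}{l(k,n)}\,\beta^{k,n}_t\,\mathbb{I}_{D_{k,n}}\Big\|_{H^{-\alpha}}, \qquad \beta^{k,n}_t := \int_0^t e^{-(t-s)/\tau}\,dM^{k,n}_s.
\end{aligned} \tag{4.23}$$

We deal with the terms in the right hand side of (4.23) separately in the following such that we can apply Gronwall's inequality. Note that the term containing the initial condition vanishes due to the assumptions of the theorem. We start with the stochastic integrals in the subsequent part (b) of the proof.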

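(b) As due to Jensen's inequality $\mathbb{E}|Y| \le \sqrt{\mathbb{E}|Y|^2}$ it makes sense to calculate the second moment of the stochastic integral in the right hand side. For the norm in $H^{-\alpha}$ we use $\|\phi\|^2_{H^{-\alpha}} = (\phi,\phi)_{H^{-\alpha}}$, and thus obtain using the linearity of the inner product, with $\beta^{k,n}_t = \int_0^t e^{-(t-s)/\tau}\,dM^{k,n}_s$ as in (4.23),

$$\mathbb{E}^n\Big\|\sum_{k=1}^P\frac{1}{l(k,n)}\,\beta^{k,n}_t\,\mathbb{I}_{D_{k,n}}\Big\|^2_{H^{-\alpha}} = \sum_{k=1}^P\frac{\mathbb{E}^n|\beta^{k,n}_t|^2}{l(k,n)^2}\,\|\mathbb{I}_{D_{k,n}}\|^2_{H^{-\alpha}} + \sum_{k\ne j}\frac{\mathbb{E}^n\big(\beta^{k,n}_t\beta^{j,n}_t\big)}{l(k,n)\,l(j,n)}\,\big(\mathbb{I}_{D_{k,n}},\mathbb{I}_{D_{j,n}}\big)_{H^{-\alpha}}.$$

We next consider the individual expectations of the random terms $|\beta^{k,n}_t|^2$ and $\beta^{k,n}_t\beta^{j,n}_t$ in the right hand side. We have already stated that the stochastic integral with respect to the martingale (4.18) is defined path-by-path as a Riemann-Stieltjes integral, see (4.20), and, moreover, (4.20) implies that the stochastic convolution integral can be written as a stochastic integral with respect to the fundamental martingale measure $M^n$ associated with the PDMP $(\Theta^n_t,t)_{t\ge0}$, see [16], i.e.,

$$\int_0^t e^{-(t-s)/\tau}\,dM^{k,n}_s = \int_{[0,t]\times\mathbb{N}_0^P} e^{-(t-s)/\tau}\big(\xi^k - \Theta^{k,n}_{s-}\big)\,M^n(ds,d\xi)$$

with predictable integrand.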
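Then we obtain due to the Itô isometry following from [16, Prop. 4.6.2] using (4.6) that

$$\begin{aligned}
\mathbb{E}^n|\beta^{k,n}_t|^2 &\le \mathbb{E}^n\int_0^t e^{-2(t-s)/\tau}\Big(\frac{1}{\tau}\,\Theta^{k,n}_s + \frac{1}{\tau}\,l(k,n)\,\bar f_{k,n}(Y^n_s)\Big)\,ds \\
&\le \frac{1}{\tau}\Big(\mathbb{E}^n\Theta^{k,n}_0 + 2\,l(k,n)\,\|f\|_0\Big)\int_0^t e^{-2(t-s)/\tau}\,ds.
\end{aligned}$$

It remains to consider the product $\beta^{k,n}_t\beta^{j,n}_t$, $k \ne j$, for which we obtain due to the integration by parts formula

$$\beta^{k,n}_t\beta^{j,n}_t = \int_0^t\beta^{k,n}_{s-}\,d\beta^{j,n}_s + \int_0^t\beta^{j,n}_{s-}\,d\beta^{k,n}_s + \big[\beta^{k,n},\beta^{j,n}\big]_t, \tag{4.24}$$

where the square brackets denote the quadratic variation process. The expectation of each of the terms in the right hand side vanishes: The first two are stochastic integrals with respect to martingales, hence martingales themselves which are identical to zero at the origin. Furthermore, as both martingales are càdlàg with paths of finite variation on compacts, hence quadratic pure jump martingales, we obtain for the quadratic variation process

$$\big[\beta^{k,n},\beta^{j,n}\big]_t = \sum_{s\le t}\Delta\beta^{k,n}_s\,\Delta\beta^{j,n}_s.$$

However, as all jump times of the two martingales a.s. differ it follows that $[\beta^{k,n},\beta^{j,n}]_t = 0$. Thus overall we have established that

$$\mathbb{E}^n\Big\|\sum_{k=1}^P\frac{1}{l(k,n)}\,\beta^{k,n}_t\,\mathbb{I}_{D_{k,n}}\Big\|^2_{H^{-\alpha}} \le \frac{1}{2}\sum_{k=1}^P\frac{\mathbb{E}^n\Theta^{k,n}_0 + 2\,l(k,n)\,\|f\|_0}{l(k,n)^2}\,\|\mathbb{I}_{D_{k,n}}\|^2_{H^{-\alpha}},$$

where $1/2$ is an upper bound for $\frac{1}{\tau}\int_0^t e^{-2(t-s)/\tau}\,ds$ independent of $t$. Estimating the norm $\|\mathbb{I}_{D_{k,n}}\|_{H^{-\alpha}}$ just as in the proof of Corollary 2.1 and using $\mathbb{E}^n\Theta^{k,n}_0 \le l(k,n)$ we finally obtain that

$$\mathbb{E}^n\Big\|\sum_{k=1}^P\frac{1}{l(k,n)}\,\beta^{k,n}_t\,\mathbb{I}_{D_{k,n}}\Big\|^2_{H^{-\alpha}} \le K_\alpha^2\,\frac{\big(1+2\|f\|_0\big)\,|D|}{2}\,\frac{v_+(n)^r}{\ell_-(n)}$$

with $r = 2\alpha/d$ for $0 < \alpha < d/2$, $r = 1-\epsilon$ for $\alpha = d/2$ and $r = 1$ for $\alpha > d/2$, where $K_\alpha^2$ denotes the respective constant from the estimates in the proof of Corollary 2.1.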
(c) We next estimate the term

$$\int_0^t e^{-(t-s)/\tau}\,\mathbb{E}^n\|F(\nu(s),s) - F^n(\nu^n_s,s)\|_{H^{-\alpha}}\,ds$$

in (4.23). From part (c) of the proof of Theorem 2.1 in Section 4.1 it follows that

$$\mathbb{E}^n\|F(\nu^n_t,t) - F^n(\nu^n_t,t)\|_{H^{-\alpha}} \le \delta_+(n)\,\frac{K_{-\alpha}L}{\pi}\Big(\sqrt{|D|}\,\big(1+\|f\|_0\big)\,\|\nabla_x w\|_{L^2\times L^2} + \|\nabla_x I(t)\|_{L^2}\Big), \tag{4.26}$$

where $F$ is the Nemytzkii operator defined in (4.4) and $K_{-\alpha}$ is a constant resulting from the continuous embedding of $L^2$ into $H^{-\alpha}$. Here, the right hand side can be further estimated independently of $t \ge 0$ using the assumption that $\|\nabla_x I(t)\|_{L^2}$ is uniformly bounded in $t \ge 0$. Furthermore we have shown in Section 4.2 in the proof of Corollary 2.1 that under the appropriate assumptions the Nemytzkii operator $F$ is Lipschitz continuous on $H^{-\alpha}$, $\alpha \ge 0$, with Lipschitz constant $L_{-\alpha} > 0$ independent of $t \ge 0$, i.e.,

$$\|F(g_1,t) - F(g_2,t)\|_{H^{-\alpha}} \le L_{-\alpha}\,\|g_1 - g_2\|_{H^{-\alpha}} \qquad \forall\, g_1, g_2 \in L^2. \tag{4.27}$$
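A combination of the triangle inequality and the estimates (4.26) and (4.27) yields

$$\int_0^t e^{-(t-s)/\tau}\,\mathbb{E}^n\|F(\nu(s),s) - F^n(\nu^n_s,s)\|_{H^{-\alpha}}\,ds \le L_{-\alpha}\int_0^t e^{-(t-s)/\tau}\,\mathbb{E}^n\|\nu(s) - \nu^n_s\|_{H^{-\alpha}}\,ds + O\big(\delta_+(n)\big).$$

Overall, it thus follows from (4.23) that

$$\mathbb{E}^n\|\nu(t) - \nu^n_t\|_{H^{-\alpha}} \le \mathbb{E}^n\|\nu(0) - \nu^n_0\|_{H^{-\alpha}} + \frac{L_{-\alpha}}{\tau}\int_0^t e^{-(t-s)/\tau}\,\mathbb{E}^n\|\nu(s) - \nu^n_s\|_{H^{-\alpha}}\,ds + O\Big(\delta_+(n) + \sqrt{v_+(n)^r/\ell_-(n)}\Big).$$

Then an application of Gronwall's inequality yields

By assumptions of the theorem the term in the right hand side converges to zero for $n \to \infty$. As this convergence is uniform in $t$ it holds that

$$\lim_{n\to\infty}\sup_{t\ge0}\mathbb{E}^n\|\nu(t) - \nu^n_t\|_{H^{-\alpha}} = 0. \tag{4.28}$$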

4.4 Proof of Theorem 2.3 (Martingale central limit theorem)

In order to prove the martingale central limit theorem we employ the general martingale central limit theorem [27, Thm. 5.1] for the Hilbert space $H^{-\alpha}$, i.e., the dual of the Sobolev space $H^\alpha$, for $\alpha > d$. To apply this theorem it suffices to prove the following conditions. Subsequently we use $\rho_n = \ell_-(n)/v_+(n)$ to denote the rescaling sequence and use the notation

$$\big(G^n(Y^n_t)\phi,\phi\big)_{H^\alpha} = \Lambda^n(Y^n_t)\int_{\mathbb{N}_0^P}\big(\nu^n(\xi) - \nu^n(\Theta^n_t),\phi\big)^2_{H^\alpha}\,\mu^n(Y^n_t,d\xi), \tag{4.29}$$

which corresponds to the quadratic variation of the martingales $(M^n_t)_{t\ge0}$, see [27] for a discussion.

(CLT1) For all $t > 0$ it holds that

$$\sup_{n\in\mathbb{N}}\rho_n\,\mathbb{E}^n\Big[\int_0^t\Lambda^n(Y^n_s)\int_{\mathbb{N}_0^P}\|\nu^n(\xi) - \nu^n(\Theta^n_s)\|^2_{H^{-\alpha}}\,\mu^n(Y^n_s,d\xi)\,ds\Big] < \infty, \tag{4.30}$$

and there exists an orthonormal basis $(\varphi_j)_{j\in\mathbb{N}}$ of $H^\alpha(D)$ such that for all $j \in \mathbb{N}$ and $t > 0$

$$\rho_n\,\mathbb{E}^n\big(G^n(Y^n_t)\varphi_j,\varphi_j\big)_{H^\alpha} \le C\gamma_j, \tag{4.31}$$

where the constants $\gamma_j > 0$ are independent of $n$ and $t$, satisfy $\sum_{j\ge1}\gamma_j < \infty$, and the constant $C > 0$ is independent of $n$ and $j$ but may depend on $t$.

(CLT2) The jump heights of the rescaled martingales are almost surely uniformly bounded, i.e., there exists a constant $\beta < \infty$ such that it holds almost surely for all $n \in \mathbb{N}$ that

$$\sup_{t\ge0}\sqrt{\rho_n}\,\|\nu^n(\Theta^n_t) - \nu^n(\Theta^n_{t-})\|_{H^{-\alpha}} < \beta. \tag{4.32}$$

Further, for all $\phi \in H^\alpha$ and all $t > 0$ it holds that

$$\lim_{n\to\infty}\int_0^t\mathbb{E}^n\Big|\big(G(\nu(s),s)\phi,\phi\big)_{H^\alpha} - \rho_n\big(G^n(Y^n_s)\phi,\phi\big)_{H^\alpha}\Big|\,ds = 0. \tag{4.33}$$
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On a technical level we note that the condition (CLTl) guarantees tightness of the sequence of 
rescaled martingales (ypn M")j>o in the Skorokhod space of cadlag functions in H^°'. This property 
is equivalent to relative compactness in the topology of weak convergence of measures and thus implies 
the existence of a convergent subsequence. The conditions (CLT2) are then sufficient to establish 
that any limit possesses the form of a diffusion process defined by the covariance operator C given 
in (|2.18p . In particular, condition (|4.33p precisely gives the convergence of the quadratic variations 
and is thus the central condition. In the subsequent two parts of the proof we show that they are 
satisfied: In part (a) we prove conditions (|4.30p and (|4.3H) and part (b) establishes (|4.32p and (|4.33p . 

(a) We first prove conditions (|4.30p and (|4.3ip . Here we also observe the significance of the 
choice of the norm in H~" with a > d for establishing the convergence, which is essentially that 
it guarantees the existence a Sobolev space H°'^ with continuous embeddings H°' ^ H°'^ ^ C!{D), 
where the first is of Hilbert-Schmidt type. For subsequent use we recall the estimates 

with a suitable constant Ka > 0, which we have already established in the proof of Corollarv l2.1l due 
to the Holder inequality and the Sobolev Embedding Theorem. Therefore we obtain for the term 
inside the expectation in (|4.30p the estimate 

Next taking the expectation, using the bound (4.6) on $\mathbb{E}^n \Theta^{k,n}_s$ and integrating over $[0,t]$ we obtain
the estimate



Jo L 



Multiplying both sides with $\rho_n = \ell_-(n)/v_+(n)$ we find that condition (4.30) is satisfied.



We proceed to condition (4.31) and first of all expand the integrand to obtain



l(k, n) 
fc=i ^ ' 

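We next estimate the term $\langle \mathbb{I}_{D_{k,n}}, \varphi_j \rangle_{H^\alpha}^2$. Here we use the fact that for a function in $L^2(D)$ its
application as an element of the dual $H^{-\alpha}$ as well as $H^{-\alpha_1}$ for any $\alpha_1$ with $0 < \alpha_1 < \alpha$ coincide. We
choose $\alpha_1$ such that $d/2 < \alpha_1 < \alpha - d/2$ and obtain

where $K_{\alpha_1}$ is the constant resulting from the Sobolev Embedding Theorem. Next taking the expectation,
estimating the expectation terms as before and multiplying by $\rho_n$ yields

\[ \rho_n\, \mathbb{E}^n \langle G^n(Y^n_t)\,\varphi_j, \varphi_j \rangle_{H^\alpha} \leq \frac{1}{\tau} K_{\alpha_1}^2 \big(1 + \|f\|_0\big) \|\varphi_j\|_{H^{\alpha_1}}^2 . \]

We choose the constants in (4.31) as $C := K_{\alpha_1}^2 (1 + \|f\|_0)/\tau$ and $\gamma_j := \|\varphi_j\|_{H^{\alpha_1}}^2$. Finally, as due to
Maurin's Theorem the embedding of the space $H^{\alpha}$ into $H^{\alpha_1}$ is of Hilbert-Schmidt type, cf. footnote
3 on p. 12, it holds that $\sum_{j \geq 1} \|\varphi_j\|_{H^{\alpha_1}}^2 < \infty$. Condition (4.31) is satisfied.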
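
To see why a gap of more than $d/2$ between the two Sobolev indices yields a summable sequence, it may help to look at a simplified setting. The following sketch assumes the periodic domain $\mathbb{T}^d$ with Fourier basis $e_k$ instead of the bounded Lipschitz domain $D$ of this study; the symbols $\mathbb{T}^d$, $e_k$ and $\hat g_k$ are introduced only for this illustration and do not appear in the proof.

```latex
% Assumption (illustration only): D is replaced by the torus T^d and
% \|g\|_{H^s}^2 = \sum_{k \in \mathbb{Z}^d} (1+|k|^2)^s |\hat g_k|^2 .
\begin{align*}
\varphi_k &:= (1+|k|^2)^{-\alpha/2}\, e_k , \qquad k \in \mathbb{Z}^d
  && \text{(orthonormal basis of } H^{\alpha}) \\
\|\varphi_k\|_{H^{\alpha_1}}^2 &= (1+|k|^2)^{\alpha_1 - \alpha} \\
\sum_{k \in \mathbb{Z}^d} \|\varphi_k\|_{H^{\alpha_1}}^2
  &= \sum_{k \in \mathbb{Z}^d} (1+|k|^2)^{-(\alpha - \alpha_1)} < \infty
  \quad \Longleftrightarrow \quad 2(\alpha - \alpha_1) > d .
\end{align*}
```

In this simplified setting the choice $\alpha_1 < \alpha - d/2$ made above is exactly the condition under which the sum is finite.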

(b) The estimates in part (a) further show that the jump sizes are almost surely uniformly 
bounded as 



sup^l|;/"(er)-;^"(©r-)|L-„ < 

t>0 



£_(n) 
Here the upper bound on the right hand side converges to zero for $n \to \infty$ and thus the left hand side
is bounded over all $n \in \mathbb{N}$. Therefore condition (4.32) holds and we are left to prove the convergence
of the quadratic variation (4.33). For the jump process the quadratic variation satisfies

p 

The quadratic variation of the limiting diffusion is given by

\[ \langle G(u(t),t)\,\phi, \phi \rangle_{H^\alpha} = \int_D \phi(x)^2 \Big( \frac{1}{\tau} u(t,x) + \frac{1}{\tau} f\Big( \int_D w(x,y)\, u(t,y)\, dy + I(t,x) \Big) \Big)\, dx . \]

Here the necessary estimates are split into several parts which are considered separately in the
following. Afterwards, the estimates are combined to infer the convergence (4.33). In the following
we use again $F$ as the Nemytzkii operator defined in (4.4). Hence, for the difference of the quadratic
variations we obtain the estimate

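\begin{align*}
\mathbb{E}^n \big| \langle G(u(t),t)\phi, \phi \rangle_{H^\alpha} - \rho_n \langle G^n(Y^n_t)\phi, \phi \rangle_{H^\alpha} \big|
 &\leq \frac{1}{\tau}\, \mathbb{E}^n \Big| \int_D \underbrace{\phi(x)^2 u(t,x)}_{(i)} + \underbrace{\phi(x)^2 F(u(t),t)(x)}_{(ii)} \, dx \\
 &\qquad\qquad - \int_D \underbrace{\phi(x)^2 \nu^n(\Theta^n_t)(x)}_{(i)} + \underbrace{\phi(x)^2 F(\nu^n(\Theta^n_t),t)(x)}_{(ii)} \, dx \Big| \\
 &\quad + \mathbb{E}^n \Big| \frac{1}{\tau} \int_D \underbrace{\phi(x)^2 \nu^n(\Theta^n_t)(x)}_{(iii)} + \underbrace{\phi(x)^2 F(\nu^n(\Theta^n_t),t)(x)}_{(iv)} \, dx - \rho_n \langle G^n(Y^n_t)\phi, \phi \rangle_{H^\alpha} \Big| . \tag{4.34}
\end{align*}

Here the sum over the populations $k = 1, \ldots, P$ that makes up $\rho_n \langle G^n(Y^n_t)\phi, \phi \rangle_{H^\alpha}$ splits into two parts, which are matched with the terms marked (iii) and (iv), respectively.
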

Using the triangle inequality once again for each of the two differences and grouping the terms marked
(i)-(iv), we obtain four terms which we subsequently estimate separately. Finally, in part (v) we
combine the four estimates.

(i) The first term is the simplest to estimate. Using the Cauchy-Schwarz inequality we obtain

\[ \mathbb{E}^n \Big| \int_D \phi(x)^2 \big( u(t,x) - \nu^n_t(x) \big)\, dx \Big| \leq \|\phi\|_{L^4}^2\, \mathbb{E}^n \| u(t) - \nu^n_t \|_{L^2} . \tag{4.35} \]
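
The appearance of the $L^4$ norm in the first term can be made explicit in one line. The following is a short worked step, writing $g := u(t) - \nu^n_t$ as a shorthand that is not used elsewhere in the proof.

```latex
% g := u(t) - \nu^n_t is a shorthand introduced for this illustration.
\Big| \int_D \phi(x)^2\, g(x)\, dx \Big|
  \;\leq\; \|\phi^2\|_{L^2}\, \|g\|_{L^2}
  \;=\; \Big( \int_D |\phi(x)|^4\, dx \Big)^{1/2} \|g\|_{L^2}
  \;=\; \|\phi\|_{L^4}^2\, \|g\|_{L^2} .
```

Taking the expectation $\mathbb{E}^n$ on both sides then gives the bound in terms of $\mathbb{E}^n \|u(t) - \nu^n_t\|_{L^2}$.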
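(ii) We next consider the difference arising from the terms marked (ii) and obtain using the
Lipschitz condition p.4p on $f$ and the Cauchy-Schwarz inequality twice

\begin{align*}
\mathbb{E}^n \Big| \int_D \phi(x)^2 \big( F(u(t),t)(x) - F(\nu^n_t,t)(x) \big)\, dx \Big|
 &\leq L\, \mathbb{E}^n \int_D |\phi(x)|^2\, \Big| \int_D w(x,y) \big( u(t,y) - \nu^n_t(y) \big)\, dy \Big| \, dx \\
 &\leq L\, \mathbb{E}^n \int_D |\phi(x)|^2\, \|w(x,\cdot)\|_{L^2}\, \|u(t) - \nu^n_t\|_{L^2} \, dx \\
 &\leq L\, \|\phi\|_{L^4}^2\, \|w\|_{L^2 \times L^2}\, \mathbb{E}^n \|u(t) - \nu^n_t\|_{L^2} . \tag{4.36}
\end{align*}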



(iii) In order to estimate the next term we use the bound (4.6) on $\mathbb{E}^n \Theta^{k,n}_t$ and thus obtain



^ l{k,n) I 



k=l 



(xfdx ^" 



l{k, n) 



c) da; 



p 



+ (l + ll/llo)E|i?fe,„l|l- 



A;=l 



0(a;) da;^ 



< (i + II/IIo)E/ 

fc=l "^-^ 



1-D 



fc,Tl I J Dk 



'^(y)dy dx 



Then the estimate is completed by applying the Poincaré inequality (4.1) to the first term, that is,
estimating



dia.m{Dk^r. 



\\^4>\\h, 



and the observation that the second term is proportional to $\|\phi^n\|_{L^2}^2$, where $\phi^n$ is the piecewise constant
approximation to $\phi$ based on the partition $\mathcal{P}_n$, see (4.2). Therefore we overall obtain an upper bound
for the difference constituted by the terms (iii) in (4.34) by

P r^k,n 

i){x)'^h't{x) dx - pn 

Id 

In the last term 



< 5+(n)- 



^^Ji^"° ll<^llgi + (l + ll/llo)i?(n). 

(4.37) 



R{n) 



^ V-{n) £-{n) 
v+{n) e+{n) 



wrwh 



converges to zero for $n \to \infty$ by assumption (2.19) and as the sequence $\|\phi^n\|_{L^2}$ is bounded, since it
converges to $\|\phi\|_{L^2}$ for $n \to \infty$.

(iv) Finally we consider the difference 

p 



D 



(x)^F(^r, t)ix) dx - p„ J2 {iD,.„ A)i^ 



k=l 



^"E| L m^F{u^,t){x)dx-j^^f^^^{Yr){^,iD,yH 



fe=l ' -'D^ 
We continue estimating the difference in each summand in the final right hand side and obtain using
the triangle inequality for the term inside the expectation



fe=i 



pn 



l{k, n) 



(*) 



We start with the first term and observe that it possesses the same structure as the term estimated
in part (c) of the proof of Theorem 2.1 with the only difference that here the function $\phi$ in the
integrand is squared. Therefore we obtain the estimate, cf. (4.15),

(*) < S+{n) ^ 11^^^^' (^(1 + ll/l|o)||VxU,||^2^^. + ||V:.J(f)||^2) . 

Next, we estimate the second term. Note that $\bar{f}_{k,n}$ is bounded by $\|f\|_0$ and thus the remaining
term is just as in part (iii) of the proof. Hence we obtain the estimate, cf. (4.37),



/ N ^ r / n2 ||,/||o II ,||2 , ||.|| 1 v-{n) £-{n) 
(** < S+{n) ||</>||^i + ll/llo 1 7-(-^-7-f 



Therefore, we overall obtain an upper bound for the difference generated by the terms (iv) by

p 



fc=l ^ ' ^ 

S+{n) ^^^^ (Vl^(l + ll/llo) llV.^||i.,i. + ||V./(t)||i.) + S+{nf Mi + H/Hq R(n), 



(4.38) 



where the term $R(n)$ is as in (4.37).

(v) To complete the proof we combine the estimates (4.35)-(4.38) to obtain

E''\{G{u{t),t)^A)H^ -Pn{G"{t) 



<{1 + L Ml.^l.) ml. E"Mt) - ^.rilL^ + (1 + 2||/||o) R{n) 

+ S+in) ^^^Mi + ll/llo) ||V.»|L.,i.) + ||V./(t)|L.) +S+inf 



III.. 



Integrating over $(0,T)$ we obtain, with a suitable positive constant independent of $n$ and $T$, the estimate



I 

Jo 



(E"||i^(t) - /^niLi((o,T),L=) +TR{n) + r5(n)(l + ||Vx/(t)||ii((o,T),L^)) + 5{nf) 



The constant depends on the norm of $\phi$ in the spaces $H^1$ and $L^4$, where the latter can be estimated
in terms of the norm in the Sobolev space $H^{\alpha}$ due to the embedding $H^{\alpha} \hookrightarrow L^4$, i.e., it is finite and
depends only on $\phi \in H^{\alpha}$. Finally, each term in the right hand side converges to zero for $n \to \infty$ and
hence condition (4.33) follows. The proof of Theorem 2.3 is completed.

Acknowledgements: The authors thank J. Touboul for directing our attention also towards the
infinite-time convergence in Theorem 2.2.
A Well-posedness of the Wilson-Cowan equation

This section provides a concise exposition, based on classical existence theory, of the well-posedness of the
Wilson-Cowan equation (1.3) and the boundedness and regularity results for its solution as referred to in Section 1.1. We
understand equation (1.3) as an $L^2(D)$-valued integral equation, i.e.,

\[ u(t) = u_0 + \frac{1}{\tau} \int_0^t \big( -u(s) + F(u(s),s) \big)\, ds, \qquad t \geq 0,\ u_0 \in L^2(D), \tag{A.1} \]

where the integral is a Bochner integral and $F$ is the Nemytzkii operator acting on $L^2(D)$ defined by

\[ F(g,t)(x) = f\Big( \int_D w(x,y)\, g(y)\, dy + I(t,x) \Big) \qquad \forall\, g \in L^2(D) . \]

As in Section 1.1 we assume that $f : \mathbb{R} \to \mathbb{R}_+$ is Lipschitz continuous, $w \in L^2(D \times D)$ and $I \in C(\mathbb{R}_+, L^2(D))$,
which implies that $F$ is continuous in $t$. Furthermore, it was shown in Section 4.1 that under these assumptions
$F(g,t)$ is Lipschitz continuous in the argument $g$ with Lipschitz constant independent of $t \geq 0$. Thus the integrand
in (A.1) is Lipschitz continuous with respect to the $L^2(D)$-valued argument for all $t \geq 0$ and, moreover, uniformly
continuous in $g$ with respect to $t$. It follows that the integrand in (A.1), that is, the map $(g,t) \mapsto -g + F(g,t)$, is
jointly continuous on $\mathbb{R}_+ \times L^2(D)$. Then [10, Thm. 5.1.1] implies that there exists a unique, strongly continuous,
global solution to (A.1) for every initial condition $u_0 \in L^2(D)$. By definition this solution is absolutely continuous
and, as $F$ is jointly continuous, the derivative of the solution is continuous and exists everywhere. Thus, we conclude
that there exists a unique continuously differentiable solution, i.e., $u \in C^1(\mathbb{R}_+, L^2(D))$.

Next, we recall an 'explicit' representation of the solution, namely the variation of constants formula (4.22), which
we already stated in Section 4.3. We have that the solution of the Wilson-Cowan equation satisfies the integral
equation

\[ u(t) = u_0 + \int_0^t \Big( A u(s) + \frac{1}{\tau} F(u(s),s) \Big)\, ds, \]

where $A$ is the linear operator in $L^2(D)$ mapping $g$ to $-g/\tau$. Thus, the solution $u$ satisfies

\[ u(t) = e^{tA} u_0 + \frac{1}{\tau} \int_0^t e^{(t-s)A} F(u(s),s)\, ds \qquad \forall\, t \geq 0 . \]

In the present setting the application of the linear operator $e^{tA}$ corresponds to the scalar multiplication with $e^{-t/\tau}$
as $A = -\frac{1}{\tau} \mathrm{Id}_{L^2}$ and thus

\[ u(t) = e^{-t/\tau} u_0 + \frac{1}{\tau} \int_0^t e^{-(t-s)/\tau} F(u(s),s)\, ds \qquad \forall\, t \geq 0 . \]
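
The variation of constants formula suggests a simple time-stepping scheme: on a short interval $[t, t + \Delta t]$ one freezes $F$ at its value at time $t$ and integrates the exponential factor exactly. The following is a minimal sketch under stated assumptions, not a scheme used in this study: the domain is taken to be $D = [0,1]$ with a uniform midpoint grid, the spatial integral is replaced by a Riemann sum, and the names `simulate_wilson_cowan`, `w`, `I`, `f`, `n_steps` are hypothetical.

```python
import numpy as np

def simulate_wilson_cowan(u0, w, I, f, tau, T, n_steps):
    """Exponential-Euler scheme based on the variation of constants formula.

    Assumptions (illustration only): D = [0, 1], uniform grid with m cells.
    u0 : (m,) initial values on the grid
    w  : (m, m) kernel values w(x_i, y_j)
    I  : callable t -> (m,) external input on the grid
    f  : gain function, applied elementwise
    """
    m = u0.shape[0]
    dx = 1.0 / m                      # quadrature weight of the Riemann sum
    dt = T / n_steps
    decay = np.exp(-dt / tau)         # action of e^{dt A} with A = -Id / tau
    u = np.array(u0, dtype=float)
    path = [u.copy()]
    for k in range(n_steps):
        t = k * dt
        # grid version of the operator F(u, t)(x) = f(int w(x,y) u(y) dy + I(t,x))
        F = f((w @ u) * dx + I(t))
        # (1/tau) * int_t^{t+dt} e^{-(t+dt-s)/tau} ds = 1 - decay, with F frozen at time t
        u = decay * u + (1.0 - decay) * F
        path.append(u.copy())
    return np.array(path)
```

Each step is a convex combination of the current state and a value of $f$, so if the initial values lie in $[0, \|f\|_0]$ and $f$ takes values in $[0, \|f\|_0]$, the discrete solution stays in that interval. This mirrors, at the discrete level, the pointwise bounds discussed in this appendix.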

We next discuss the results stated in Section 1.1 on the higher spatial regularity of solutions to (A.1). A
pointwise bound on $u(t) \in L^2(D)$, i.e., a constant $C$ such that $|u(t,x)| \leq C$ for almost all $x \in D$ and all $t \geq 0$, is
then easily obtained by an approximation argument, that is, approximating the less regular solution by solutions
of higher regularity. It is possible to prove the pointwise bounds directly, see, e.g., [26] for such an argument
in a similar setting. However, it is easier and more illustrative to use available results for solutions of higher spatial
regularity, which usually arise as the deterministic solutions of (A.1) one is interested in. E.g., the authors
in [34] argue that from an application point of view it is reasonable to consider at least continuous solutions. In
particular, the authors in [34] present a detailed existence and uniqueness result for the activity based Amari mean
field equation and state that an analogous result holds for the Wilson-Cowan equation (A.1) for spatial dimensions
$d \leq 3$, which covers all physically relevant domains. Concerning the spatial regularity they consider the space $H^{\alpha}(D)$,
where $\alpha$ is set to be the smallest integer such that $\alpha > d/2$. The significance of the choice of $\alpha > d/2$ is, as so
often in this study, that this implies the embedding of the space $H^{\alpha}(D)$ into $C(D)$. Furthermore we then even
obtain that $C([0,T], H^{\alpha}(D)) \subset C([0,T] \times D)$, i.e., the solution $u$ is jointly continuous.

Therefore we have the subsequent theorem which is sufficient for the set-up in this study. However we note that
existence and uniqueness of solutions of the Amari equation were considered under less strict regularity assumptions
on the coefficients in [24] and we conjecture that these are also valid for the Wilson-Cowan equation.

Theorem A.1 [34, Sec. 2] The domain $D$ is bounded and satisfies the strong local Lipschitz property. We assume
that $w \in H^{\alpha}(D \times D)$, that $f \in C^{\alpha}(\mathbb{R})$ with all derivatives bounded, and that $I \in C(\mathbb{R}_+, H^{\alpha}(D))$. Then there exists
a unique global solution $u \in C([0,T], H^{\alpha}(D))$ for every $T > 0$ and every initial condition $u_0 \in H^{\alpha}(D)$ to (A.1)
which depends continuously on the initial condition and is continuously differentiable. Moreover the solution is
globally bounded in $H^{\alpha}(D)$ if the externally applied current $I$ is globally bounded.

Remark A.1 In the work [34] the authors assume for the domain only the cone property which is implied by the
strong local Lipschitz property, see [2, p. 84]. The latter is the necessary boundary regularity for the present study,
cf. footnote 3 on p. 12. Furthermore, in the reference [34] it is also assumed that the gain function $f$ is infinitely often
differentiable with bounded derivatives, but it is surely sufficient for $f$ to be $\alpha$-times continuously differentiable.
Finally, it remains to show the pointwise bound $u(t,x) \in (0, \|f\|_0)$ if the initial condition satisfies $u_0(x) \in
[0, \|f\|_0]$, as proposed in Section 1.1. Under Theorem A.1 the solution $u(t,x)$ to (A.1) is jointly continuous and
therefore the Wilson-Cowan equation holds pointwise in $x$ everywhere and for all $t \geq 0$. Furthermore, $t \mapsto u(t,x)$ is
continuously differentiable for every fixed $x \in D$ and it is immediate that the bounds are satisfied due to the fact
that the derivative of the solution approaching $0$ or $\|f\|_0$ becomes positive or negative, respectively. Now, using an
approximation result of smooth solutions converging to the $L^2(D)$ solution we obtain that even in this less regular
case the pointwise bounds hold almost everywhere.
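
The sign argument for the derivative can be written out as a short worked step. It is a sketch under the assumption that $f$ takes values in $[0, \|f\|_0]$; the inequalities are strict, as stated above, wherever $f$ takes values strictly between $0$ and $\|f\|_0$.

```latex
% Pointwise form of the Wilson-Cowan equation at a fixed x \in D:
%   \tau\, \partial_t u(t,x) = -u(t,x) + F(u(t),t)(x), \quad 0 \le F(u(t),t)(x) \le \|f\|_0 .
\begin{align*}
u(t,x) = 0 \quad &\Longrightarrow \quad \tau\, \partial_t u(t,x) = F(u(t),t)(x) \;\geq\; 0 , \\
u(t,x) = \|f\|_0 \quad &\Longrightarrow \quad \tau\, \partial_t u(t,x) = -\|f\|_0 + F(u(t),t)(x) \;\leq\; 0 .
\end{align*}
```

Hence a trajectory $t \mapsto u(t,x)$ starting in $[0, \|f\|_0]$ is pushed back into the interval whenever it reaches one of the two end points.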



B Comparisons of moment equations 

In this section we discuss the moment equations for the $L^2(D)$-valued jump Markov processes $\nu^n_t = \nu^n(\Theta^n_t)$. These
can be derived from the corresponding moment equations of the jump Markov process $(\Theta^n_t)_{t \geq 0}$ taking values in
$\mathbb{N}^P$. This process is analogous in structure to the usual model used in chemical reaction kinetics, cf., e.g., [21]. Thus
we can use the formulae derived in this reference to obtain, e.g., for the mean the system of differential equations

fc=i j=i 

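Furthermore it is straightforward to state a system for the second moments, however, we are not so much interested
in the moments of the Markov chain model but in those of the $L^2(D)$-valued processes $(\nu^n_t)_{t \geq 0}$ which we can compare
to the Langevin approximation. As $\nu^n$ is a linear mapping from $\mathbb{R}^P$ into $L^2(D)$, it holds that $\nu^n(\mathbb{E}^n \Theta^n_t) = \mathbb{E}^n \nu^n(\Theta^n_t)$
and $\nu^n\big(\frac{d}{dt} \mathbb{E}^n \Theta^n_t\big) = \frac{d}{dt} \mathbb{E}^n \nu^n(\Theta^n_t)$, and thus

\[ \frac{d}{dt} \mathbb{E}^n \nu^n_t = -\frac{1}{\tau}\, \mathbb{E}^n \nu^n_t + \frac{1}{\tau}\, \mathbb{E}^n F^n(\nu^n_t, t) . \tag{B.2} \]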
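
To make the link between the Markov chain and its mean equation more tangible, the jump process can be simulated with the standard stochastic simulation (Gillespie) algorithm. The sketch below is an illustration under assumptions and not the construction used in this study: it assumes that population $k$ gains one active neuron at rate $l_k f(\cdot)/\tau$ and loses one at rate $\theta^k/\tau$, uses a time-independent input, and all names (`gillespie_populations`, `l`, `W`, `I`) are hypothetical.

```python
import numpy as np

def gillespie_populations(theta0, l, W, I, f, tau, T, rng):
    """Simulate the activity counts theta = (theta^1, ..., theta^P) up to time T.

    Assumed rates (illustration only):
      theta^k -> theta^k + 1  at rate  l[k] * f(input_k) / tau,
      theta^k -> theta^k - 1  at rate  theta[k] / tau,
    with input_k = sum_j W[k, j] * theta[j] / l[j] + I[k].
    """
    theta = np.array(theta0, dtype=np.int64)
    P = theta.size
    t = 0.0
    while True:
        inp = W @ (theta / l) + I
        up = l * f(inp) / tau          # activation rates
        down = theta / tau             # inactivation rates, zero if theta^k = 0
        rates = np.concatenate([up, down])
        total = rates.sum()
        if total <= 0.0:
            break                      # no further transitions possible
        t += rng.exponential(1.0 / total)   # waiting time to the next jump
        if t > T:
            break
        j = rng.choice(2 * P, p=rates / total)  # select which transition fires
        if j < P:
            theta[j] += 1
        else:
            theta[j - P] -= 1
    return theta
```

Under these assumed rates the rescaled counts $\theta^k / l_k$ play the role of the values of $\nu^n_t$ on the subdomains, and averaging them over many independent runs gives a Monte Carlo estimate of the mean appearing in (B.2).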



For the second moments of the $L^2(D)$-valued process we obtain for all $\phi \in L^2(D)$
-E"(0,,.?)2, =E"(^-1^ / ^{x)dx) 



1 ^ ^ 

-E" 





N 2 




4>{^s) dx J 





- E" [(0, v^)^^ (0, -v^ + F" ^2] + (G"(er, 0, 0)^2 , (B.3) 



where the bilinear form $\langle G^n(\Theta^n_t, t)\, \phi, \phi \rangle_{L^2}$ is as defined in (4.29).

Next, we state the moment equations for the stochastic partial differential equations. We assume that the
Langevin approximation (2.22) possesses a (strong) solution in an appropriate Hilbert space $H$ and employ the
Itô formula [12, Sec. 4.5], which yields for all $\phi \in H^*$



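\[ \langle \phi, V_t \rangle_H = \langle \phi, V_0 \rangle_H + \epsilon_n \int_0^t \big\langle \phi, \sqrt{G(V_s,s)}\, dW_s \big\rangle_H + \int_0^t \Big\langle \phi, -\frac{1}{\tau} V_s + \frac{1}{\tau} F(V_s,s) \Big\rangle_H ds \]

and

\begin{align*}
\langle \phi, V_t \rangle_H^2 = \langle \phi, V_0 \rangle_H^2 &+ \epsilon_n \int_0^t 2\, \langle \phi, V_s \rangle_H \big\langle \phi, \sqrt{G(V_s,s)}\, dW_s \big\rangle_H \\
 &+ 2 \int_0^t \langle \phi, V_s \rangle_H \Big\langle \phi, -\frac{1}{\tau} V_s + \frac{1}{\tau} F(V_s,s) \Big\rangle_H ds + \epsilon_n^2 \int_0^t \langle \phi, G(V_s,s)\, \phi \rangle_H \, ds .
\end{align*}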

Next, we take the expectation on both sides of these identities and differentiate with respect to $t$, resulting for the first
moment in the differential equations

which is equivalent to the abstract evolution equation in $H$ given by

\[ \frac{d}{dt} \mathbb{E} V_t = -\frac{1}{\tau}\, \mathbb{E} V_t + \frac{1}{\tau}\, \mathbb{E} F(V_t, t) . \tag{B.4} \]
dt r T 
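And for the second moment we obtain the differential equation

\[ \frac{d}{dt} \mathbb{E} \langle \phi, V_t \rangle_H^2 = \frac{2}{\tau}\, \mathbb{E} \big[ \langle \phi, V_t \rangle_H \langle \phi, -V_t + F(V_t,t) \rangle_H \big] + \epsilon_n^2\, \mathbb{E} \big[ \langle G(V_t,t)\, \phi, \phi \rangle_H \big] . \tag{B.5} \]

Further, the linear noise approximation (2.21) satisfies the equations

\[ \frac{d}{dt} \mathbb{E} U_t = -\frac{1}{\tau}\, \mathbb{E} U_t + \frac{1}{\tau}\, \mathbb{E} F(U_t, t) \tag{B.6} \]

and

\[ \frac{d}{dt} \mathbb{E} \langle \phi, U_t \rangle_H^2 = \frac{2}{\tau}\, \mathbb{E} \big[ \langle \phi, U_t \rangle_H \langle \phi, -U_t + F(U_t,t) \rangle_H \big] + \epsilon_n^2\, \langle G(u(t),t)\, \phi, \phi \rangle_H . \tag{B.7} \]

Finally, we note that exactly the same moment equations hold for the variants of the linear noise and Langevin
approximation using a Q-Wiener process and an appropriate diffusion coefficient, cf. Remark 2.4.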

A comparison of the moment equations (B.1), (B.4), (B.6) for the mean and (B.3), (B.5), (B.7) for the second
moments shows that they are similar in structure but do not coincide. This is analogous to the properties of the
moment equations in finite dimension and as in finite dimensions there is one exception, which is the case of first
order transitions: If $F$ were affine in its first argument, i.e., $F(g,t) = f_1(t) \cdot g + f_2(t)$, then we obtain that the first moment equations
(B.4) and (B.6) of the Langevin and linear noise approximation, respectively, reduce to the Wilson-Cowan equation
with $u(t) = \mathbb{E} V_t = \mathbb{E} U_t$. Furthermore, if $F$ is affine, this implies that also $G$ is affine in $V_t$ and thus

\[ \langle \phi, G(V_t,t)\, \phi \rangle_H = \frac{1}{\tau} \Big( \langle \phi, V_t \cdot \phi \rangle_H + \langle \phi, f_1(t) \cdot V_t \cdot \phi \rangle_H + \langle \phi, f_2(t) \cdot \phi \rangle_H \Big) . \tag{B.8} \]

Taking the expectation on both sides and assuming interchangeability of the expectation with the application of
all the linear forms (think of the duality pairing as the inner product in $L^2(D)$) we obtain

\[ \mathbb{E} \langle G(V_t,t)\, \phi, \phi \rangle_H = \frac{1}{\tau} \Big( \langle \phi, \mathbb{E}[V_t] \cdot \phi \rangle_H + \langle \phi, f_1(t) \cdot \mathbb{E}[V_t] \cdot \phi \rangle_H + \langle \phi, f_2(t) \cdot \phi \rangle_H \Big) = \langle G(\mathbb{E}[V_t],t)\, \phi, \phi \rangle_H . \tag{B.9} \]

As $\mathbb{E} V_t = \mathbb{E} U_t = u(t)$ we obtain that the second moment equations for the Langevin approximation and the linear
noise approximation coincide. Moreover, they are closed (for each $\phi$), i.e., the system depends only on $\mathbb{E} V_t$ and
$\mathbb{E} \langle \phi, V_t \rangle_H^2$. Again, this corresponds to the well-known case from finite-dimensional chemical reaction kinetics.
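
The reduction of the first moment equation in the affine case can be checked in two lines. The following worked step writes $m(t) := \mathbb{E} V_t$, a shorthand introduced only here, and assumes as above that the expectation may be interchanged with the affine map.

```latex
% Assumption: F(g,t) = f_1(t) \cdot g + f_2(t); m(t) := \mathbb{E} V_t is a shorthand for this illustration.
\begin{align*}
\mathbb{E} F(V_t, t) &= f_1(t) \cdot \mathbb{E} V_t + f_2(t) = F(m(t), t) , \\
\frac{d}{dt} m(t) &= -\frac{1}{\tau}\, m(t) + \frac{1}{\tau}\, F(m(t), t) .
\end{align*}
```

The second line is the Wilson-Cowan equation for $m$, so that $m(t) = u(t)$ whenever the initial conditions agree. For a nonlinear gain function the first equality fails in general, which is why the moment equations do not close.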

Finally, if $F$ is affine also the connection of the moment equations for the resulting Markov chain models is
interesting. On the one hand the equation for the mean coincides with the Wilson-Cowan equation where the
gain function in its right hand side is given by $F^n$. As $F^n$ is essentially a piecewise constant approximation to $F$
the resulting equations for the mean correspond to a spatial discretisation of the Wilson-Cowan equation, cf. the
continuum limit in the derivation of the mean field equation in [5].